
HTML and XML

1) What is meant by WWW?
The World Wide Web (WWW) is a collection of traditional Internet access methods (FTP, Gopher, Telnet, etc.) together with a newer communications method called HyperText Transfer Protocol (HTTP).
The WWW uses the concept of a page for viewing information. Each page is a single text file written in HyperText Markup Language (HTML). A WWW browser retrieves this HTML file from a remote computer, known as the HTTP server, and uses it to determine the appearance of that particular WWW page. An HTML document can contain pointers to other HTML documents, graphics, files, sounds, and even descriptions of buttons and other on-screen elements for displaying data. This interconnection of HTML documents on computers all over the Internet, each containing pointers to HTML documents on other computers, has created a kind of web of virtual documents, which is where the term ‘web’ comes from.
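
As a rough illustration of the "pointers" an HTML document can contain, the following Python sketch collects the addresses referenced by a small page. The page content, the file names and the class name PointerCollector are assumptions made up for this example; only the standard library is used.

```python
from html.parser import HTMLParser


class PointerCollector(HTMLParser):
    """Collect the addresses an HTML page points to (links, images, media)."""

    # Attributes that hold a pointer to another resource.
    POINTER_ATTRS = {"href", "src"}

    def __init__(self):
        super().__init__()
        self.pointers = []  # list of (tag, address) pairs

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in self.POINTER_ATTRS and value:
                self.pointers.append((tag, value))


# Hypothetical page content, used only for illustration.
PAGE = """
<html>
  <head><title>Library home</title></head>
  <body>
    <a href="catalogue.html">Catalogue</a>
    <img src="logo.gif" alt="Logo">
    <a href="http://www.example.org/help.html">Help</a>
  </body>
</html>
"""

collector = PointerCollector()
collector.feed(PAGE)
for tag, target in collector.pointers:
    print(tag, "->", target)
```

Each address found this way may itself lead to a page with further pointers, which is how the web of documents arises.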

2) Distinguish between Hypertext, Hyperlink and Hypermedia.
Hypertext is basically the same as regular text (it can be stored, read, searched, or edited) with one important exception: hypertext contains connections within the text to other documents. A hyperlink is such a connection: selecting a specific part of a document gives access to another document. Hyperlinks can create a complex virtual web of connections.
Hypermedia is hypertext with a difference: hypermedia documents contain links not only to other pieces of text, but also to other forms of media such as sounds, images, animation and movies.

3) Define a markup.
The word markup was originally used to describe annotations or other marks within a text intended to instruct a compositor or typist how a particular passage should be printed or laid out.
A ‘markup language’ may be no more than a loose set of markup conventions used together for encoding texts. A markup language must specify what markup is allowed and where, what markup is required, how markup is to be distinguished from text, and what the markup means.

4) What is HTML? Mention some basic HTML tags.
HTML is a content-based structured markup language in which the codes (tags) describe what the contents are. Some of the basic HTML tags are <head>, <title>, the headings <h1> to <h6>, and <body>.
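
To make the basic tags concrete, here is a minimal sketch in Python that holds a small HTML page as a string and reads back its title and headings. The page text and the class name OutlineParser are placeholders invented for this example.

```python
from html.parser import HTMLParser

# A minimal page using the basic tags; the text is a placeholder.
MINIMAL_PAGE = """<html>
<head>
<title>My first page</title>
</head>
<body>
<h1>Main heading</h1>
<h2>Sub-heading</h2>
<p>Body text goes here.</p>
</body>
</html>"""


class OutlineParser(HTMLParser):
    """Record the title and the headings of a page."""

    HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.current = None   # tag whose text is being read, if any
        self.title = ""
        self.headings = []    # list of (tag, text) pairs

    def handle_starttag(self, tag, attrs):
        if tag == "title" or tag in self.HEADINGS:
            self.current = tag

    def handle_endtag(self, tag):
        if tag == self.current:
            self.current = None

    def handle_data(self, data):
        if self.current == "title":
            self.title += data
        elif self.current in self.HEADINGS:
            self.headings.append((self.current, data.strip()))


parser = OutlineParser()
parser.feed(MINIMAL_PAGE)
print(parser.title)
print(parser.headings)
```

The head section carries information about the page (here only its title), while everything the browser displays sits inside the body.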

5) Mention the different types of links that can be created in an HTML document.
The different types of links that can be created are: (i) links to documents in other directories or on other websites, (ii) links to specific sections of documents, (iii) links between sections of different documents, (iv) links to specific sections of the current document, etc.
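
The link types listed above differ mainly in what is written in the href value. The sketch below, which assumes a hypothetical page address and made-up file names, shows how each kind of href resolves to a full address and how the section part (the text after #) is separated from the document part.

```python
from urllib.parse import urljoin, urldefrag

# Hypothetical address of the page that contains the links.
BASE = "http://www.example.org/notes/html.html"

# One href value for each kind of link; all names are illustrative.
HREFS = {
    "other directory": "../guides/xml.html",
    "other website": "http://www.example.com/index.html",
    "section of another document": "glossary.html#unicode",
    "section of current document": "#links",
}

# Resolve every href against the address of the containing page.
resolved = {kind: urljoin(BASE, href) for kind, href in HREFS.items()}

for kind, url in resolved.items():
    document, section = urldefrag(url)
    print(f"{kind}: document={document} section={section or '(none)'}")
```

A link to a section only works if the target document marks that section with a matching anchor name or id.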

6) Why is XML needed over HTML?
eXtensible Markup Language (XML) is a kind of markup language.
It has certain advantages over HTML:
XML can carry data.
XML was designed to describe data and to focus on what the data is.
HTML is about displaying information; XML is about describing information.
XML is extensible: one can define one's own tags.
XML is used to exchange data, which is very difficult with HTML.
XML is also considered a meta-language: it can be used to create new markup languages.
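
A small sketch of the "carry data" and "exchange data" points: one program writes a record as XML using tags of its own choosing, and another reads it back. The member record, its tag names and the helper functions to_xml and from_xml are all invented for this illustration.

```python
import xml.etree.ElementTree as ET

# A hypothetical member record; the tag names are our own invention,
# which is what XML's extensibility permits.
record = {"name": "A. Reader", "member_id": "M-1024", "city": "Bangalore"}


def to_xml(data, root_tag="member"):
    """Serialise a flat dictionary as an XML string."""
    root = ET.Element(root_tag)
    for tag, value in data.items():
        ET.SubElement(root, tag).text = value
    return ET.tostring(root, encoding="unicode")


def from_xml(xml_text):
    """Rebuild the dictionary from the XML string."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}


xml_text = to_xml(record)   # what the sender transmits
print(xml_text)
received = from_xml(xml_text)  # what the receiver recovers
print(received)
```

The exchange works only because sender and receiver agree on what the tags mean, a point that matters for the library applications discussed later.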

7) What are semantic tags?
XML was designed to attach semantics to data, i.e., to add context to the data. It does so by allowing users to define their own tags. For example,

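<book>
  <title>Prolegomena to library classification</title>
  <author>
    <l_name>Ranganathan</l_name>
    <f_name>S.R.</f_name>
  </author>
  <edition>3rd reprint</edition>
  <place>Bangalore</place>
  <publisher>Sarada Ranganathan Endowment</publisher>
  <physical_description>640 p.</physical_description>
</book>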

The example shows the structure of a document that describes a book titled Prolegomena to library classification. The book has title, author, edition, place, publisher, and physical description elements. The author element is further divided into first name (f_name) and last name (l_name). The actual data is stored inside these tags. Because the tags provide context to the whole structure of the document, they are known as semantic tags.
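
Because each value is wrapped in a tag that says what it is, a program can ask for a value by name. The following Python sketch parses a record like the one described above; apart from f_name and l_name, the tag names are assumptions chosen for this example.

```python
import xml.etree.ElementTree as ET

# Tag names other than f_name and l_name are assumed for this sketch.
BOOK_XML = """
<book>
  <title>Prolegomena to library classification</title>
  <author>
    <l_name>Ranganathan</l_name>
    <f_name>S.R.</f_name>
  </author>
  <edition>3rd reprint</edition>
  <place>Bangalore</place>
  <publisher>Sarada Ranganathan Endowment</publisher>
  <physical_description>640 p.</physical_description>
</book>
"""

book = ET.fromstring(BOOK_XML)

# The tag tells us what each value means, so we can ask for it by name.
title = book.findtext("title")
last_name = book.findtext("author/l_name")
first_name = book.findtext("author/f_name")
place = book.findtext("place")

print(f"{last_name}, {first_name}: {title} ({place})")
```

The same text without tags would be a list of strings whose roles (author, place, publisher) a program could only guess.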

8) Describe the library applications of XML.
XML has several implications for the library environment. Its first and foremost use is in information exchange. Libraries are sitting on a heap of MARC formats and, ironically, this heap of standard MARCs has created a kind of non-standardization. In such a situation XML can serve as a common platform for information exchange, provided that everyone accepts a common set of tags.

XML can also be used in digital libraries, where it can serve as a document surrogate such as a catalogue record. It would still be ambitious to claim that XML can beat DBMS (Database Management Systems) or be a solution for BDBMS (Bibliographic Database Management Systems), but a great amount of bibliographic data exchange on the web already takes place using XML.

Searching is another area where XML is of great help. Because it provides context for a search term, searching becomes more efficient, particularly when everyone agrees to follow a common set of tags. XML can improve the search efficiency of current search engines. There are projects under development to identify schemas for searching; RDF (Resource Description Framework) is one initiative in this direction.
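
The idea of context-aware searching can be sketched as follows: the same term gives different results depending on which tag it is looked for in. The catalogue, its tag names other than f_name and l_name, and the second record are invented for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalogue; the second record is invented for contrast.
CATALOGUE_XML = """
<catalogue>
  <book>
    <title>Prolegomena to library classification</title>
    <author><l_name>Ranganathan</l_name><f_name>S.R.</f_name></author>
  </book>
  <book>
    <title>A placeholder book about Ranganathan</title>
    <author><l_name>Example</l_name><f_name>A.</f_name></author>
  </book>
</catalogue>
"""

catalogue = ET.fromstring(CATALOGUE_XML)


def search(term, path):
    """Return titles of books whose element at `path` contains `term`."""
    term = term.lower()
    hits = []
    for book in catalogue.findall("book"):
        value = book.findtext(path, default="")
        if term in value.lower():
            hits.append(book.findtext("title"))
    return hits


as_author = search("ranganathan", "author/l_name")  # books BY Ranganathan
in_title = search("ranganathan", "title")           # books ABOUT Ranganathan
print(as_author)
print(in_title)
```

A plain full-text search would return both records for the same term; the tags let the searcher say which sense is meant.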

KEYWORDS
Assistive Technologies : Devices used by people with disabilities to access computers. Some assistive technologies include text-to-speech screen readers, alternative keyboards and mice, head pointing devices, voice recognition software, and screen magnification software.
Attribute : A setting for a tag that affects how the tagged element behaves or is displayed.
Browser : A program used to access and display web pages. Graphical browsers can display images and many different text fonts; non-graphical browsers cannot.
CGI : Common Gateway Interface is a way to allow users to provide information to scripts attached to web pages, usually through forms.
Cyberspace : The imaginary space users of the web move around in. A metaphor that many people take almost literally.
Domain Name : The name of an Internet site, for example www.dell.com or www.indiatimes.com.
Font : A font, strictly speaking, is a set of characters that all belong to the same size and style of a typeface. For example, Courier.
Forms : The mechanism by which web pages become interactive, allowing users to supply input to CGI or other scripts.
FTP : File Transfer Protocol, a way to exchange files with other sites on the Internet.
Gopher : A protocol that is older than HTTP and serves a similar purpose, allowing users to tunnel through cyberspace in search of information.
Graphic : A picture or illustration, also called image.
HTTP : HyperText Transfer Protocol, the conventions used by web browsers and servers to transfer web pages.
Hypermedia : A combination of hypertext and multimedia that allows users to move in a non-linear fashion through text, images, sounds, and other information.
Hypertext : A collection of documents joined by links so that users can read them in a variety of different orders.
Image File : A file containing an image.
Indexers : Programs that read pages throughout the web and add a description of their contents to a database that can be searched by users looking for specific information.
Link : The anchor tag (<a>) is used to define both anchors and links. A link is a directive to a browser: when a user selects a link, a new page is loaded. Some people call a link a hotlink or hyperlink.
Multimedia : The combination of several different communications techniques: for example sound, written text, still pictures, and moving pictures.
Nested : An element that is entirely contained within another element. For example, the phrase ‘<i>the <b>quick</b> brown fox</i>’ contains a bold element (the word ‘quick’) nested within an italic element (the entire phrase). Some browsers will display the word ‘quick’ only as bold, others will display it as both bold and italic.
Plug-ins : Software programs that enhance other programs or applications on your computer. There are plug-ins for Internet browsers, graphics programs, and other applications.
Server : A program running on an Internet site that makes the web pages at that site available to browsers throughout the Internet.
Site : Internet website.
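Tags : Markers, enclosed in angle brackets, that carry metadata about the content they enclose.
Unicode : The universal character encoding, maintained by the Unicode Consortium. This encoding standard provides the basis for processing, storage and interchange of text data in any language in all modern software and ICT protocols. Early versions used two bytes (16 bits) for each character; current versions define code points up to U+10FFFF, stored as one to four bytes in UTF-8, two or four bytes in UTF-16, or four bytes in UTF-32.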
URI : Uniform Resource Identifier - URIs have been known by many names: WWW addresses, Universal Document Identifiers, Universal Resource Identifiers, and finally the combination of Uniform Resource Locators (URL) and Names (URN). As far as HTTP is concerned, Uniform Resource Identifiers are simply formatted strings that identify - via name, location, or any other characteristic - a resource.
W3C : An international industry consortium which develops common protocols that promote WWW evolution and ensure its interoperability. W3C develops interoperable technologies (specifications, guidelines, software, and tools) to lead the Web to its full potential as a forum for information, commerce, communication, and collective understanding.
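
As a brief illustration of the Unicode entry in the keyword list above, the following Python sketch prints the code point of a few sample characters and the number of bytes each one takes in three common Unicode encodings. The sample characters and the helper name byte_sizes are chosen only for this example.

```python
# Sample characters: Latin, Devanagari, and an emoji outside the 16-bit range.
SAMPLES = ["A", "क", "😀"]

# Little-endian variants are used so that no byte-order mark is counted.
ENCODINGS = ("utf-8", "utf-16-le", "utf-32-le")


def byte_sizes(ch):
    """Return the number of bytes `ch` occupies in each encoding."""
    return {enc: len(ch.encode(enc)) for enc in ENCODINGS}


for ch in SAMPLES:
    code_point = f"U+{ord(ch):04X}"
    print(ch, code_point, byte_sizes(ch))
```

The emoji needs four bytes even in UTF-16, which shows that 16 bits are not enough for every character in the current standard.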

Source: IGNOU Study Material
