XML: Importance and necessity

The essence of XML is in its name: Extensible Markup Language.

In his piece ‘A Really, Really, Really Good Introduction to XML’, Tom Myer breaks XML down as follows:


XML is extensible. It lets you define your own tags, the order in which they occur, and how they should be processed or displayed. Another way to think about extensibility is to consider that XML allows all of us to extend our notion of what a document is: it can be a file that lives on a file server, or it can be a transient piece of data that flows between two computer systems (as in the case of Web Services).


The most recognisable feature of XML is its tags, or elements (to be more accurate). In fact, the elements you’ll create in XML will be very similar to the elements you’ve already been creating in your HTML documents. However, XML allows you to define your own set of tags.


XML is a language that’s very similar to HTML. It’s much more flexible than HTML because it allows you to create your own custom tags.
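To see what defining your own tags looks like in practice, here is a minimal sketch using Python's standard library. The tag names (`recipe`, `name`, `servings`) and their contents are made up for illustration; they do not come from Myer's article, and XML itself does not predefine them.

```python
import xml.etree.ElementTree as ET

# Hypothetical tag names chosen for illustration; nothing in XML predefines them.
recipe = ET.Element("recipe")
ET.SubElement(recipe, "name").text = "Pancakes"
ET.SubElement(recipe, "servings").text = "4"

# Serialise the element tree to a plain string of markup.
xml_text = ET.tostring(recipe, encoding="unicode")
print(xml_text)
```

Because the tags are our own, any program that reads this document has to be told what `recipe` or `servings` means; XML only guarantees the structure.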

I found that XML was a simple concept, and I needed to be sure that the simplicity I saw was not me simply misunderstanding the whole concept. To confirm that my understanding was correct, I decided to read some articles, look at how XML is used online and watch some tutorial videos explaining more about it. The one above I found easy to understand, and it gave me a better understanding of the physical act of writing XML.

Why is XML important?

In his article ‘What is XML and Why Should Companies Use It?’, Alan Pelz-Sharpe explains that business technology has become relatively cheap, so smaller home-based businesses can now compete on a larger playing field. However, he stresses that these businesses need to consider and use XML from the off. XML-based data management should be at the foundation of any business or system that manages data. Smaller businesses that do not store or categorise their data correctly run the risk of losing information that may be necessary to their income. Customer data that is not categorised correctly can, in some cases, cost the business owner valuable marketing and contact information.

My XML Learning Task

As I don’t feel that I have enough content to manage and categorise yet, I chose to take a favourite book of mine and see how I would go about giving it some structure using XML. The following shows the author, title and book description. These are the identifiers that I have chosen to categorise the book. They are very basic and are used by many online selling platforms worldwide.


<!-- A single root element wraps the fields so the document is well-formed -->
<book>
  <author>Douglas Adams</author>
  <title>The Hitchhiker’s Guide to the Galaxy</title>
  <description>Seconds before the Earth is demolished to make way for a galactic freeway, Arthur Dent is plucked off the planet by his friend Ford Prefect, a researcher for the revised edition of The Hitchhiker’s Guide to the Galaxy who, for the last fifteen years, has been posing as an out-of-work actor.
Together this dynamic pair begin a journey through space aided by quotes from The Hitchhiker’s Guide (“A towel is about the most massively useful thing an interstellar hitchhiker can have”) and a galaxy-full of fellow travelers: Zaphod Beeblebrox–the two-headed, three-armed ex-hippie and totally out-to-lunch president of the galaxy; Trillian, Zaphod’s girlfriend (formerly Tricia McMillan), whom Arthur tried to pick up at a cocktail party once upon a time zone; Marvin, a paranoid, brilliant, and chronically depressed robot; Veet Voojagig, a former graduate student who is obsessed with the disappearance of all the ballpoint pens he bought over the years.</description>
</book>

I used an online XML validator, CodeBeauty, to check whether my XML had been written correctly, and it works!
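The same kind of check can be run locally. The sketch below uses Python's standard library parser to test whether a piece of text is well-formed XML; the helper name `is_well_formed` and the wrapping element name `book` are my own assumptions for illustration. It only tests well-formedness, not validity against a schema, and it shows that a strict parser expects a single root element around the fields.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as one well-formed XML document."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Two top-level elements with no shared parent: not a well-formed document.
fragment = (
    "<author>Douglas Adams</author>"
    "<title>The Hitchhiker's Guide to the Galaxy</title>"
)

# The same elements wrapped in one root element (the name `book` is an assumption).
wrapped = "<book>" + fragment + "</book>"

print(is_well_formed(fragment))  # False
print(is_well_formed(wrapped))   # True
```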

Susan Schreibman – Digital Scholarly Editing

Susan Schreibman, Digitising Scholarly Editing

Schreibman looks at the placement of the digital text and the context in which it is set. In the printed format, the cover and the tactile nature of the medium set the scene for what the text should be, much like judging a book by its cover. Schreibman looks at how placement in the digital environment affects the reader’s perception of the text and how it may be experienced differently from the printed version.

Much like the way in which text is displayed digitally, Schreibman shows the evolution of a common format and structure: one that governs not only how the reader views texts on screen, but also how the computer or device reads them. Machine-readable languages such as HTML and XML allow for further investigation of the content within the texts themselves. The reader can search for specific terms or tags within the text, like a built-in footnote system. This is a great move away from the flat, PDF-like format, where no interaction or search can take place. This evolution also gave a new home to the layering of multiple editions and formats of a text. No longer are we limited to the format chosen by the person digitizing the text; now we can choose from the many editions of that text, where huge variations can be found.
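As a rough illustration of searching by tag and by term, here is a small Python sketch. The passage, and the tag names `poem`, `line` and `name`, are invented for this example and are not taken from Schreibman's chapter or from any real edition.

```python
import xml.etree.ElementTree as ET

# A made-up encoded passage; the tag names and the text are assumptions.
sample = """<poem>
  <line n="1">The <name>Vogons</name> arrived at dawn,</line>
  <line n="2">and <name>Arthur</name> was still in his dressing gown.</line>
</poem>"""

root = ET.fromstring(sample)

# Search by tag: collect everything the encoder marked up as a name.
names = [el.text for el in root.iter("name")]

# Search by term: find the numbers of the lines whose full text mentions "dawn".
lines_with_term = [
    line.get("n")
    for line in root.iter("line")
    if "dawn" in "".join(line.itertext())
]

print(names)            # ['Vogons', 'Arthur']
print(lines_with_term)  # ['1']
```

The tag search only works because someone encoded the names in the first place, which is where the encoder's judgement comes in.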

Schreibman is cautious in her review of the digitising of texts and believes that the way it is done can be subjective on the part of the person carrying it out. It is as if we, as humans conducting the task of digitization, should remove all human emotions or considerations and apply a mechanical, almost robotic approach to the activity. We must strip the act of digitizing back to a mechanical one, to allow for the deep human interaction that will take place after the task is completed. Consideration must be given to avoiding any subjectivity.

We can draw parallels with the creation of the web, where a single format allowed many people to come together, share information and engage. Tim Berners-Lee knew how important it was to have a single language for all to communicate in. On the digitizing of texts, Schreibman knew that there needed to be a single format and approach for all digitizing: remove the human or individual approach and adopt a more mechanical one.

The Text Encoding Initiative (TEI) gives structure and guidelines for how digitization should happen. Not only does it give all users one encoding format, it also allows the completed work to be experienced by the wider public through the many platforms that use the same format: libraries, museums, publishers, and individual contributors and readers.
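To give a feel for what a shared encoding format makes possible, the sketch below reads a very small TEI-style document with Python's standard library. The document is a skeleton I wrote for illustration, with placeholder title and text; it follows the usual header-plus-text layout and the TEI namespace, but it has not been validated against the TEI schema, so treat it as an assumption-laden example rather than a model edition.

```python
import xml.etree.ElementTree as ET

# The TEI namespace; the prefix `tei` is just a local label for lookups.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Placeholder content written for this example only.
tei_doc = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>A sample title</title></titleStmt>
      <publicationStmt><p>Unpublished example.</p></publicationStmt>
      <sourceDesc><p>Written for illustration; no source.</p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <p>First paragraph of the encoded text.</p>
    </body>
  </text>
</TEI>"""

root = ET.fromstring(tei_doc)

# Because the layout is shared, any tool can find the title in the same place.
title = root.find("tei:teiHeader/tei:fileDesc/tei:titleStmt/tei:title", TEI_NS).text

# The body paragraphs sit under text/body, separate from the descriptive header.
paragraphs = [p.text for p in root.findall("tei:text/tei:body/tei:p", TEI_NS)]

print(title)
print(paragraphs)
```

A library catalogue, a publisher's site and an individual reader's script could all run the same lookup on any document encoded this way.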

