First, we will open the file, read it line by line, and convert each line from this format:
112|1|Say, "He is Allah , [who is] One,
112|2|Allah , the Eternal Refuge.
112|3|He neither begets nor is born,
112|4|Nor is there to Him any equivalent."
to this XML format:
<?xml version="1.0" encoding="utf-8"?>
<quran>
  <sura no="112">
    <verse no="1">Say, "He is Allah , [who is] One,</verse>
    <verse no="2">Allah , the Eternal Refuge.</verse>
    <verse no="3">He neither begets nor is born,</verse>
    <verse no="4">Nor is there to Him any equivalent."</verse>
  </sura>
</quran>
We will chop this big single Quran file into 114 files, each containing one surah. We will place all these files in a premade folder called qxmlen. Here is the code.
 1 #!/usr/bin/perl
 2 open(FH, '<', 'saheeh.tab') || die "can't open saheeh.tab: $!\n";
 3 $oldsura = "1";
 4 open(OUT, '>', 'qxmlen/1.xml') || die "can't create qxmlen/1.xml: $!\n";
 5 print OUT "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n";
 6 print OUT "<quran>\n";
 7 print OUT "<sura no=\"1\">\n";
 8 while ($line = <FH>) {                              # read the input one line at a time
 9     ($sura, $v, $aya) = split(/\|/, $line, 3);      # sura number, verse number, verse text
10     if ($oldsura ne $sura) {                        # a new surah starts here
11         print OUT "</sura>\n</quran>\n";            # finish the previous file
12         close(OUT);
13         open(OUT, '>', "qxmlen/$sura.xml") || die "can't create qxmlen/$sura.xml: $!\n";
14         print OUT "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n";
15         print OUT "<quran>\n";
16         print OUT "<sura no=\"$sura\">\n";
17         $oldsura = $sura; }
18     chomp($aya);                                    # drop the trailing newline
19     print OUT "<verse no=\"$v\">";
20     print OUT "$aya";
21     print OUT "</verse>\n";
22 }
23 print OUT "</sura>\n</quran>";
24 close(OUT);
25 close(FH);
Line 4: we are creating a file for writing, indicated by the '>' sign.
Line 8: we read the next line of the input file into $line.
Line 9: we split that line on the '|' character into $sura, $v, and $aya.
In this way we have created 114 XML representations of surahs. You can see them in the folder: http://www.textminingthequran.com/data/qxmlen/. I am not going to create another XML tutorial because the people at W3Schools have already written wonderful ones. Here I am only going to show you how to present and format your XML output using a CSS stylesheet. The details can be found in the W3Schools tutorial.
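One caveat about the conversion script above: it writes the verse text into the XML as-is. That works as long as the text never contains characters such as & or <, which are special in XML. The following is a minimal sketch, not part of the original script, of how a single input line could be converted with escaping. The helper names xml_escape and line_to_verse are hypothetical and only serve the illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: escape the characters that are special in XML text.
# The ampersand must be replaced first, otherwise the entities written by
# the later substitutions would be escaped a second time.
sub xml_escape {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# Hypothetical helper: turn one "sura|verse|text" line into a <verse> element.
# Returns the sura number together with the element, so that the caller can
# decide which output file the verse belongs to.
sub line_to_verse {
    my ($line) = @_;
    chomp $line;
    # The limit of 3 keeps any further '|' characters inside the verse text.
    my ($sura, $v, $aya) = split /\|/, $line, 3;
    return ($sura, '<verse no="' . $v . '">' . xml_escape($aya) . '</verse>');
}

my ($sura, $xml) = line_to_verse("112|2|Allah , the Eternal Refuge.\n");
print "$sura: $xml\n";
# prints: 112: <verse no="2">Allah , the Eternal Refuge.</verse>
```

Inside the while loop, the three print statements for a verse could then be replaced by a single print of the returned element.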
If you open a raw XML file, the presentation might not look appealing to a human reader, like the picture below:
But now, I am including a CSS stylesheet within the XML file at line 2 below:
<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/css" href="quran.css"?>
<quran>
  <sura no="112">
    <verse no="1">Say, "He is Allah , [who is] One,</verse>
    <verse no="2">Allah , the Eternal Refuge.</verse>
    <verse no="3">He neither begets nor is born,</verse>
    <verse no="4">Nor is there to Him any equivalent."</verse>
  </sura>
</quran>
And this quran.css is as follows:
quran {
  background-color: #ffffff;
  width: 100%;
}
sura {
  margin-left: 10px;
  color: #FF0000;
}
verse {
  display: block;
  color: #0000FF;
  font-size: 12pt;
}
Again, you can go through a CSS tutorial at W3Schools. Now, the file looks better:
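The stylesheet line does not have to be added to 114 files by hand. As a rough sketch, assuming we are free to change the conversion script, the opening lines of each file could be produced by one small function that optionally includes the xml-stylesheet line. The function name xml_header and its parameters are hypothetical and are not part of the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: build the opening lines of one surah file.
# $href is the stylesheet file name (optional); $type is its MIME type,
# for example "text/css", and defaults to "text/css" when left out.
sub xml_header {
    my ($sura, $href, $type) = @_;
    $type = 'text/css' unless defined $type;
    my $out = qq{<?xml version="1.0" encoding="utf-8"?>\n};
    # The stylesheet instruction is only written when a file name was given.
    $out .= qq{<?xml-stylesheet type="$type" href="$href"?>\n} if defined $href;
    $out .= qq{<quran>\n<sura no="$sura">\n};
    return $out;
}

# Header for surah 112 that points to quran.css.
print xml_header(112, 'quran.css');
```

In the conversion script, each group of three header print statements could then become a single print OUT xml_header(...) call.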
Now let us aim for more control over the XML through an XSLT transformation, which gives us finer control over XML elements and attributes. Following is the XSLT stylesheet, stored in a file quran.xsl, that presents verses in a tabular form. See the W3Schools tutorial.
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <html>
      <body>
        <h2>Quran::Sura No. <xsl:value-of select="quran/sura/@no"/></h2>
        <table border="1">
          <tr bgcolor="#9acd32">
            <th>No</th>
            <th>Verse</th>
          </tr>
          <xsl:for-each select="quran/sura/verse">
            <tr>
              <td><xsl:value-of select="@no"/></td>
              <td><xsl:value-of select="."/></td>
            </tr>
          </xsl:for-each>
        </table>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
We include this XSL into the 111.xml file as follows:
<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" href="quran.xsl"?>
<quran>
  <sura no="111">
    <verse no="1">May the hands of Abu Lahab be ruined, and ruined is he.</verse>
    <verse no="2">His wealth will not avail him or that which he gained.</verse>
    <verse no="3">He will [enter to] burn in a Fire of [blazing] flame</verse>
    <verse no="4">And his wife [as well] - the carrier of firewood.</verse>
    <verse no="5">Around her neck is a rope of [twisted] fiber.</verse>
  </sura>
</quran>
And the result is as follows:
Let us wrap up. We learned how to read and write files using Perl. We created XML files, and saw how we can view these XML files using CSS and XSL stylesheets, benefitting from some of the tutorials at W3Schools.
tutorial@textminingthequran.com