libreoffice bug reporting (part 4)

The information required to create a bug assistant page is extracted from the advanced bug tracker query form and from the wiki page describing the components, using a mixture of XSLT and perl scripts. The BugReport_Details page has been updated with classes conforming to a newly defined libreoffice-bug microformat.

BugReport_Details constraints

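The BugReport_Details page is used as a data source. The following command line fetches it and converts it to XHTML so that it is parseable; stripping the XHTML namespace declaration lets unprefixed XPath expressions such as div match: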

curl --silent http://wiki.documentfoundation.org/BugReport_Details |
tidy --numeric-entities yes -asxhtml 2>/dev/null |
perl -pe 's|xmlns="http://www.w3.org/1999/xhtml"||'

It is composed using templates such as FdoSCS2 or BugzAssHlp_Chart. The level of nesting is decided based on what is more convenient for the person updating the wiki. It does not matter to the data extraction scripts because they rely only on the classes found in the page.
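
As a rough illustration of why the nesting does not matter, the Python sketch below looks elements up by class at any depth. The sample markup and the helper name find_by_class are assumptions made for this example; the actual scripts are XSLT and perl.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the same 'component' class at two nesting depths.
SAMPLE = """<body>
  <div class="component"><h2>Chart</h2></div>
  <table><tr><td>
    <div class="component extra"><h2>Writer</h2></div>
  </td></tr></table>
</body>"""


def find_by_class(root, name):
    """Return every element whose class attribute contains the token `name`."""
    return [el for el in root.iter() if name in el.get("class", "").split()]


root = ET.fromstring(SAMPLE)
# The first child of each matched element holds the component name.
names = [el[0].text for el in find_by_class(root, "component")]
print(names)
```

The lookup matches whole class tokens, which is slightly stricter than the substring test done by contains() in XPath.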

(06:46:31 PM) Rainer_Bielefeld: dachary: tell me if I can help in any way. For example, we will need Help texts for the Subcomponents. Some very few like are on the Wiki. I would like to have them on the wiki, currently I have a little preference for a new template for each Subcomponent similar to the tested ones for the Components. May be you create a MAILING Template...
(06:46:33 PM) Rainer_Bielefeld: ...due to your needs and I will create more for the resting Subcomponents until Tuesday?
(06:55:12 PM) dachary: Rainer_Bielefeld: I'm not sure that I understand your last question. Would you be so kind as to rephrase it for me ?
(06:57:26 PM) dachary: Rainer_Bielefeld: in http://wiki.documentfoundation.org/BugReport_Details could you also explain why it is more convenient for you to have the component details in a separate page such as http://wiki.documentfoundation.org/Template:BugzAssHlp_Chart instead of how it was done before ?
(06:58:51 PM) dachary: Rainer_Bielefeld: as far as I'm concerned that makes no difference because the scripts I wrote get the content from http://wiki.documentfoundation.org/BugReport_Details and do not see what is a template or not.
(07:57:10 PM) Rainer_Bielefeld: dachary: The Components can be read from ttp://wiki.documentfoundation.org/BugReport_Details as you do, also with minimum Explication from that page. I prefer Templates for each subcomponent help text because those text can be used more flexible in the wiki. But if it eases your work it's no problem at all to create a page containing all subcomponent texts so that you can read the help texts...
(07:57:12 PM) Rainer_Bielefeld: ...for the subcomponents from 1 single wiki page

composing the bug assistant form

The data extracted from the sources as explained below are composed using the bug.xhtml page. Each dataset is in a separate XHTML file that is included as follows:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"
[
<!ENTITY component_comments SYSTEM "component_comments.xhtml">
<!ENTITY components SYSTEM "components.xhtml">
<!ENTITY subcomponents SYSTEM "subcomponents.xhtml">
<!ENTITY versions SYSTEM "versions.xhtml">
]
>

and used in the body of the XHTML file as follows:

&components;

The command xsltproc is used to process the result in UTF-8 and without validation to speed up the process:

xsltproc --encoding UTF-8 --novalid ...

The templates required for the processing have been discussed to find the simplest pattern.
An identity transform is used when something must be copied. It would be simpler to use xsl:copy-of, but that would not allow further transformation of the tree being copied. Such a transformation is required, for instance, to remove the FAQ link from the comment and keep only the search link.
When an xsl:template matches an element, it must contain instructions about how to handle the element and its descendants. If it contains an xsl:apply-templates element, the templates are applied recursively to the nodes designated by its select attribute (all child nodes when select is omitted).
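
The copy-with-exceptions idea can be sketched outside XSLT as well. The Python function below is only an analogy, not the stylesheet itself: it copies a tree recursively and skips elements carrying an unwanted class, which is what an identity transform combined with an empty template achieves. The DROPPED set, the transform name and the sample markup are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

DROPPED = {"faq"}  # assumption: class tokens to remove while copying


def transform(el):
    """Copy `el` recursively, skipping children whose class is in DROPPED."""
    out = ET.Element(el.tag, dict(el.attrib))
    out.text, out.tail = el.text, el.tail
    for child in el:
        if DROPPED & set(child.get("class", "").split()):
            # Plays the role of an empty xsl:template; in this sketch the
            # tail text following the dropped element is lost as well.
            continue
        out.append(transform(child))
    return out


src = ET.fromstring(
    '<div class="comment"><a class="faq" href="#f">FAQ</a>'
    '<a class="search" href="#s">Search</a></div>'
)
result = ET.tostring(transform(src), encoding="unicode")
print(result)
```

A plain deep copy would be shorter, just as xsl:copy-of is, but it would offer no place to hook in the exception.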

component comments

The comments explaining each component are extracted from BugReport_Details using the component_comments.xsl stylesheet. It matches the component class (match="div[contains(@class,'component')]"), gets its name from the first child element (xsl:value-of select="*[position()=1]"), copies the rest (xsl:apply-templates select="*[position()>1]") and drops the faq and submit links (xsl:template match="//*[contains(@class,'faq') or contains(@class,'submit')]"/).
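
To make the split between the name and the rest of the comment concrete, here is a Python sketch of the same steps. It is not a port of component_comments.xsl: the sample page, the helper names and the returned dictionary are assumptions, and it filters only the direct children of each component, whereas the stylesheet drops faq and submit links at any depth.

```python
import xml.etree.ElementTree as ET

# Hypothetical excerpt standing in for the tidied BugReport_Details page.
PAGE = """<body>
  <div class="component">
    <h2>Chart</h2>
    <p>Problems with diagrams.</p>
    <a class="faq" href="#faq">FAQ</a>
    <a class="search" href="#search">Search</a>
    <a class="submit" href="#submit">Submit</a>
  </div>
</body>"""


def has_class(el, *names):
    """True if the element carries at least one of the class tokens."""
    return bool(set(names) & set(el.get("class", "").split()))


def component_comments(root):
    """Map each component name to the child elements kept as its comment."""
    comments = {}
    for div in root.iter("div"):
        if not has_class(div, "component"):
            continue
        children = list(div)
        name = children[0].text.strip()      # like *[position()=1]
        kept = [c for c in children[1:]      # like *[position()>1]
                if not has_class(c, "faq", "submit")]
        comments[name] = kept
    return comments


comments = component_comments(ET.fromstring(PAGE))
summary = {name: [c.tag for c in kept] for name, kept in comments.items()}
print(summary)
```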

subcomponents

The subcomponents select elements are extracted from BugReport_Details using the subcomponents.xsl stylesheet. There is no identity transform because the structure of the tree to be copied is already known. The stylesheet first matches the component as above, builds a select element using the name of the component and recurses on the descendants matching the search class (descendant::*[contains(@class,'search')]) to build each option element.
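
The following Python sketch mirrors that description: one select per component, one option per descendant carrying the search class. The sample markup, the use of the component name as the name attribute and the use of the link text as the option label are assumptions made for this example; subcomponents.xsl may well build them differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical excerpt standing in for the tidied BugReport_Details page.
PAGE = """<body>
  <div class="component">
    <h2>Chart</h2>
    <ul>
      <li><a class="search" href="#a">Axis</a></li>
      <li><a class="search" href="#l">Legend</a></li>
    </ul>
  </div>
</body>"""


def has_class(el, name):
    """True if the element carries the class token `name`."""
    return name in el.get("class", "").split()


def build_selects(root):
    """Build one select element per component found under `root`."""
    selects = []
    for div in root.iter("div"):
        if not has_class(div, "component"):
            continue
        select = ET.Element("select", {"name": div[0].text.strip()})
        # iter() walks the div and all its descendants in document order.
        for el in div.iter():
            if has_class(el, "search"):
                ET.SubElement(select, "option").text = (el.text or "").strip()
        selects.append(select)
    return selects


selects = build_selects(ET.fromstring(PAGE))
html = ET.tostring(selects[0], encoding="unicode")
print(html)
```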

bugzilla query form

The advanced bugzilla query form is used to get the list of components and versions. Because it can’t be enriched easily with a microformat, a perl script has been written.
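
The perl script itself is not shown here. As an illustration of the kind of extraction involved, the Python sketch below collects the option values of chosen select elements from an HTML form. The sample form, the select names component and version, and the class name SelectOptions are assumptions made for this example and should be checked against the real query form.

```python
from html.parser import HTMLParser

# Hypothetical, heavily reduced stand-in for the query form.
FORM = """<form>
  <select name="component"><option value="Chart">Chart</option>
    <option value="Writer">Writer</option></select>
  <select name="version"><option value="3.3.0">3.3.0</option></select>
  <select name="priority"><option value="high">high</option></select>
</form>"""


class SelectOptions(HTMLParser):
    """Collect option values of the select elements listed in `wanted`."""

    def __init__(self, wanted):
        super().__init__()
        self.options = {name: [] for name in wanted}
        self.current = None  # name of the wanted select being read, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "select":
            name = attrs.get("name")
            self.current = name if name in self.options else None
        elif tag == "option" and self.current is not None:
            self.options[self.current].append(attrs.get("value", ""))

    def handle_endtag(self, tag):
        if tag == "select":
            self.current = None


parser = SelectOptions(["component", "version"])
parser.feed(FORM)
print(parser.options)
```

An HTML parser is used rather than an XML one because the form is not guaranteed to be well-formed XHTML.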