failure to fix a webkit bug

A webkit bug was selected as a candidate for a first proposed patch. It turns out to be a deep problem being progressively fixed since 2009 and unfit for a first contribution.

getting ready

Following instructions

svn checkout http://svn.webkit.org/repository/webkit/trunk WebKit

Pull build dependencies for the webkit debian package to get most of what’s needed and add the rest, as instructed each time the configuration run by build-webkit fails (below).

apt-get build-dep webkit
apt-get install libgail-3-dev  libgeoclue-dev

Build for GTK+ in debug mode

WebKit/Tools/Scripts/build-webkit --minimal --gtk --debug

It fails with

  CXX    Source/WebCore/bindings/gobject/libwebkitgtk_3_0_la-WebKitDOMCustom.lo
  CXX    Source/WebCore/bindings/gobject/libwebkitgtk_3_0_la-WebKitDOMEventTarget.lo
  CXX    Source/WebCore/bindings/gobject/libwebkitgtk_3_0_la-WebKitDOMObject.lo
  CXX    Source/WebCore/bindings/gobject/libwebkitgtk_3_0_la-WebKitHTMLElementWrapperFactory.lo
../../Source/WebCore/bindings/gobject/WebKitDOMCustom.cpp: In function ‘WebKitDOMBlob* webkit_dom_blob_slice(WebKitDOMBlob*, gint64, gint64, const gchar*)’:
../../Source/WebCore/bindings/gobject/WebKitDOMCustom.cpp:49:71: error: ‘webkit_dom_blob_webkit_slice’ was not declared in this scope
../../Source/WebCore/bindings/gobject/WebKitDOMCustom.cpp:50:1: warning: control reaches end of non-void function [-Wreturn-type]
make[1]: *** [Source/WebCore/bindings/gobject/libwebkitgtk_3_0_la-WebKitDOMCustom.lo] Error 1
make[1]: *** Waiting for unfinished jobs....
make[1]: Leaving directory `/media/tmp/webkit/WebKit/WebKitBuild/Debug'
make: *** [all] Error 2
rm DerivedSources/WebCore/JSSQLStatementErrorCallback.h

It appears that the –minimal option is not well supported for gtk:

(11:09:02 AM) philn-tp: dachary: the minimal gtk build is fragile because not tested in the bots

another built is run without –minimal

WebKit/Tools/Scripts/build-webkit --gtk --debug

It requires over 8GB or RAM and about 15GB of disk space. It has been reported to compile with less than 8GB when enough swap space is available:

(09:55:45 PM) rakuco: I built webkitgtk debug on less than 4G a few days ago
(09:57:02 PM) acidx: I built webkitefl debug with 512MB not so long ago -- took me some 16 hours to link after I've added some swap space :P
(09:58:39 PM) dachary: with 16GB of RAM it's no longer crashing on ar cru .libs/libWebCore.a Source/WebCore/accessibility/.libs/libWebCore_la-Ac etc.
(10:02:10 PM) dachary: rakuco: how much swap space do you have ? I'm building gtk maybe it's heavier that what you've built ?
(10:05:53 PM) rakuco: dachary: ~9G in that test machine I used

Running the built binary requires the same options as build-webkit:

(12:35:55 PM) dachary: Hi, When I run Tools/Scripts/run-launcher it says: Unsupported platform, can't determine built library locations. Is there any way to know why this happens ? I'm happy to install packages to resolve this. But since the error message does not say anything regarding the failed assertions that leads to conclude my platform is not recognized it's fairly difficult for me to improve the situation.
(12:37:37 PM) ***dachary reading Tools/Scripts/webkitdirs.pm
(12:37:54 PM) philn-tp: run-launcher --gtk for instance
Philip` philn-tp
(12:38:21 PM) dachary: philn-tp: thanks for the hint :-)
(12:38:23 PM) dachary: Tools/Scripts/run-launcher --gtk
(12:38:23 PM) dachary: Can't find built framework at "NotFound".
(12:38:54 PM) philn-tp: did you build it?
(12:39:16 PM) dachary: yes
(12:39:29 PM) dachary: I added more information by editing Tools/Scripts/webkitdirs.pm
(12:39:35 PM) dachary: root@webkit-server:~/WebKit# Tools/Scripts/run-launcher --gtk
(12:39:35 PM) dachary: Can't find built framework JavaScriptCore at "NotFound".
(12:39:53 PM) dachary: not a full path name yet, but a little less obscure ;-)
(12:40:07 PM) dachary: I built with Tools/Scripts/build-webkit --gtk --debug
(12:40:28 PM) philn-tp: so by default the binaries should be in WebKitBuild/Debug
(12:40:49 PM) dachary: Tools/Scripts/run-launcher --gtk --debug
(12:40:54 PM) dachary: seems happy
(12:40:56 PM) dachary: and works !
(12:41:29 PM) dachary: the key is to understand that the same options apply to the run-launcher and build-webkit which is not obvious from the --help
Philip` philn-tp
(12:41:38 PM) dachary: philn-tp: thanks for being with me on this one ;-)

Running the tests for the webkit GTK launcher requires to follow specific instructions.

(12:46:11 PM) dachary: time Tools/Scripts/run-webkit-tests --gtk --debug
(12:46:11 PM) dachary: No httpd found. Cannot run http tests.
(12:46:26 PM) Zoltan: --no-http
(12:46:34 PM) Zoltan: and it will skip http tests
(12:47:05 PM) dachary: Zoltan thanks. When I do that are the tests supposed to complete real fast ?
(12:47:28 PM) Zoltan: negative
(12:48:07 PM) Zoltan: the http tests are only a small part
(12:48:24 PM) Zoltan: there are more than 22000tests in the trunk
(12:49:33 PM) dachary: Zoltan this is what I get when I add --no-http
root@webkit-server:~/WebKit# time Tools/Scripts/run-webkit-tests --gtk --debug --no-http
Running new-run-webkit-tests with one child process.
For more parallelism, run new-run-webkit-tests directly.
worker/0 raised OSError('[Errno 2] No such file or directory'):
  layout_tests/controllers/worker.py:91 (in run)
    self._worker_connection.run_message_loop()
  layout_tests/controllers/message_broker.py:191 (in run_message_loop)
    self._broker.run_message_loop(self._run_topic, self._client, delay_secs)
  layout_tests/controllers/message_broker.py:127 (in run_message_loop)
    self._run_loop(topic_name, client, block=True, delay_secs=delay_secs)
  layout_tests/controllers/message_broker.py:141 (in _run_loop)
    self._dispatch_message(msg, client)
  layout_tests/controllers/message_broker.py:150 (in _dispatch_message)
    message_handler(message.src, *optargs)
  layout_tests/controllers/worker.py:111 (in handle_test_list)
    self._run_test(test_input)
  layout_tests/controllers/worker.py:126 (in _run_test)
    result = self.run_test_with_timeout(test_input, test_timeout_sec)
  layout_tests/controllers/worker.py:165 (in run_test_with_timeout)
    return self._run_test_in_this_thread(test_input)
  layout_tests/controllers/worker.py:249 (in _run_test_in_this_thread)
    self._driver.start()
  layout_tests/port/gtk.py:48 (in start)
    self._xvfb_process = subprocess.Popen(run_xvfb, stderr=devnull)
  /usr/lib/python2.6/subprocess.py:623 (in __init__)
    errread, errwrite)
  /usr/lib/python2.6/subprocess.py:1141 (in _execute_child)
    raise child_exception

real    0m6.190s
user    0m5.432s
sys     0m0.696s

(12:50:14 PM) dachary: at least I know my build is ok because I can run the launcher
(12:50:23 PM) philn-tp: dachary: is it a svn checkout ?
(12:50:31 PM) dachary: philn-tp: yes, from yesterday
(12:51:33 PM) philn-tp: dachary: ah you need to install xvfb
(12:51:47 PM) philn-tp: dachary: see also https://trac.webkit.org/wiki/WebKitGtkLayoutTests
(12:52:19 PM) dachary: philn-tp: thanks. I would have had a hard time figuring this one out without your help :-)
(12:52:41 PM) dachary: I'll be back after lunch.
(12:53:18 PM) dachary: time Tools/Scripts/run-webkit-tests --gtk --debug --verbose --no-http
(12:53:26 PM) dachary: is running fine after xvfb is installed
(12:53:44 PM) dachary: Let see if it completes before I'm back ;-)

It fails to run on a non root user because it blocks on acquiring the http lock for an unknown reason. Instead it is run without http as follows:

time Tools/Scripts/run-webkit-tests --gtk --debug --no-http

There are many failures but it was decided not to insist because some tests can be run individually and run fine.

looking for a bug to fix

Created a bugzilla account.
Browse EasyFix bugs as recommended on irc.freenode.net#webkit:

(08:55:35 AM) rniwa: dachary: have you tried EasyFix keyword?
(08:57:00 AM) rniwa: dachary: also, I suggest you look at new bugs
(08:57:04 AM) rniwa: dachary: old bugs tend to be harder
(08:57:22 AM) rniwa: dachary: because easy bugs tend to be fixed quicker

Unfortunately all the easyhacks are taken.

Looking for NEW bugs that have noone assigned:

The No support of DOM events on a frameless document was picked because it contained a simple test case, was apparently left unattended for a few months and looked like a simple fix.

writing a test

(09:21:33 AM) dachary: Hi, I'm creating a test that demonstrate the behavior described at https://bugs.webkit.org/show_bug.cgi?id=26147 . It looks like it belongs to LayoutTests/dom/xhtml/level2/events. But since all tests there come from http://www.w3.org/2001/DOM-Test-Sui(09:24:42 AM) ***dachary reading http://www.webkit.org/quality/testwriting.html
(09:33:51 AM) dachary: it looks like LayoutTests/fast/events is a better place for a https://bugs.webkit.org/show_bug.cgi?id=26147 test
(09:36:12 AM) ***dachary reading more from http://trac.webkit.org/wiki/Writing%20Layout%20Tests%20for%20DumpRenderTree
(09:37:22 AM) Zoltan: dachary: it's good to see your activity
(10:08:43 AM) dachary: Zoltan because I'm lost most of the time ? :-) I wrote a test and try to run it with Tools/Scripts/run-webkit-tests --gtk --debug LayoutTests/fast/events/DOMParse-addEventListener.html but I get errors, probably because I misunderstand what the "console" div is for
(10:12:41 AM) philn-tp: dachary: Tools/Scripts/run-webkit-tests --gtk --debug fast/events/DOMParse-addEventListener.html
(10:16:38 AM) dachary: philn-tp: thanks, I did not realize I could omit the leadking "LayoutTests/" directory part.
(10:16:07 AM) dachary: I got it now.  The test ( https://bug-26147-attachments.webkit.org/attachment.cgi?id=109264 ) demonstrating https://bugs.webkit.org/show_bug.cgi?id=26147 fails as it should . I'm just not sure if I placed it where I should ( LayoutTests/fast/events/DOMParse-addEventListener.html ).

tracing the problem

Following the Running Tests in GDB instructions and stripping the -v:

root@webkit-server:~/WebKit# gdb WebKitBuild/Debug/Programs/DumpRenderTree
(gdb) run LayoutTests/fast/events/DOMParse-addEventListener.html
Starting program: /root/WebKit/WebKitBuild/Debug/Programs/DumpRenderTree LayoutTests/fast/events/DOMParse-addEventListener.html
[Thread debugging using libthread_db enabled]
[New Thread 0x7fffa81df700 (LWP 18720)]
This tests is for https://bugs.webkit.org/show_bug.cgi?id=26147 and shows that in a document loaded with DOMParser, events are propagated.

On success, you will see a series of "PASS" messages, followed by "TEST COMPLETE".


FAIL fired should be true. Was false.
PASS successfullyParsed is true

TEST COMPLETE

#EOF
LEAK: 3 CachedResource
LEAK: 57 WebCoreNode
[Inferior 1 (process 18717) exited normally]
(gdb)

locating the problem

In order to figure out where the problem happens, the execution is run step by step under gdb, comparing what happens with :

document.dispatchEvent(evt);

where document is the document being displayed and

doc.dispatchEvent(evt);

where doc is the result of

var doc = parser.parseFromString(thatDoc, 'text/xml');

It differs in

  void JSEventListener::handleEvent(ScriptExecutionContext* scriptExecutionContext, Event* event)
  {
...
    JSDOMGlobalObject* globalObject = toJSDOMGlobalObject(scriptExecutionContext, m_isolatedWorld.get());
      if (!globalObject)
          return;
...
}

where globalObject is null for the parsed XML document and the handler is therefore not fired. The reason why it is null is because the scriptExecutionContext->frame() function returns null. It matches the diagnostic of the bug report claiming that the problem does not exist when the XML document is loaded using an iframe.

researching solutions

Ask for advices on the webkit-dev mailing list.
Read MessagePort messages are dispatched to documents that are not fully active to understand one of the sanity check on frames .
Read Safari crashed when trying to login using the windows live contact control to understand the reason for the return at

   JSDOMGlobalObject* globalObject = this->globalObject();
    // Null check as clearGlobalObject() can clear this and we still get called back by
    // xmlhttprequest objects. See http://bugs.webkit.org/show_bug.cgi?id=13275
    // FIXME: Is this check still necessary? Requests are supposed to be stopped before clearGlobalObject() is called.
    if (!globalObject)
        return;

It may be worth returning false so that the caller can rely on this assertion to know the events will be ignored.
The jsDOMParserPrototypeFunctionParseFromString function from WebKitBuild/Debug/DerivedSources/WebCore/JSDOMParser.cpp which calls DOMParser::parseFromString from Source/WebCore/xml/DOMParser.cpp which calls DOMImplementation::createDocument with a null Frame* argument, is generated by generate-bindings.pl from Source/WebCore/xml/DOMParser.idl . If this function was given the Frame of the enclosing document, it could assign it to the parsed document.
A possible hack would be to wrap the parsed string in a hidden iframe

(12:09:37 PM) ***dachary trying to check if https://bugs.webkit.org/show_bug.cgi?id=26147 also happens for a svg document
(12:11:30 PM) WildFox: dachary: for sure :-)
(12:12:11 PM) WildFox: dachary: we're reusing exactly the same js listeners
(12:14:37 PM) dachary: WildFox: Thanks for saving me the trouble of checking :-) I was hoping to find a clue to fix the problem. In JSEventListener::handleEvent the JSDOMGlobalObject is retrieved from the Frame* . But there are no Frame* when you parseFromString and I have no clue what SDOMGlobalObject should be used instead.
(12:16:51 PM) dachary: reading the svn history of this function, it looks like it was never designed / modified to allow handling of events for Documents* that are created from parseFromString.
(12:17:48 PM) ***dachary looking for a predicate in Document.h that would return true if the document is created from parseFromString
(12:19:02 PM) WildFox: dachary: right, it wasn't designed for this use
(12:19:42 PM) WildFox: dachary: you need access to the ExecState in order to call the handleEvent function
(12:19:48 PM) WildFox: dachary: obtained from the global object
(12:20:11 PM) WildFox: dachary: is the scriptExecutionContext null already or just the global object returned by toJSDOMGlobalObject ?
(12:20:22 PM) WildFox: ah sorry, I missed the assertion above, it's always non-null
(12:20:25 PM) dachary: scriptExecutionContext is the Document
(12:20:43 PM) dachary: only toJSDOMGlobalObject returns null because Document->frame() returns null
(12:22:21 PM) WildFox: dachary: I'm too unfamiliar with the general design, but maybe one could use an "empty hidden frame" when creating the document
(12:25:56 PM) dachary: WildFox: do you figure that solution would be accepted ? I mean, forcing the creation of a frame.
(12:26:13 PM) anttik: why can't it be the same tree, painted with different transformation?
(12:26:15 PM) dachary: WildFox: it looks like a hack but I'm not sure.

failure

related bug was found and the reporter Eric Seidel apparently worked on patches to improve the situation back in 2009, most of which have been implemented.
The extent of the changes and the level of expertise necessary to implement them partly demonstrates that fixing the chosen bug properly cannot be achieved as a first time contribution to webkit.