Friday, December 11, 2009

So you want a new Talos suite, eh?

This quarter Alice and I have focused on trimming the list of pending test suites, and several new ones (419776, 524089, 515540, 506772) have been turned on in production.

The process for getting a new suite in has become a lot clearer, so we gave a presentation at the recent all-hands to help developers know what to do on their end and what RelEng can do for them once their test suite is ready for staging.

Here's what a developer needs to do:
  • Download and install Standalone Talos to test their suite in
  • Once they have established that the test works on at least one platform, write a patch against Talos in CVS
  • File a bug against RelEng in the General component and provide the following information:
    • Contact person who will work with us on getting the test suite enabled
    • What the test does, what the expected output should be, long name, short description
    • Which branches and platforms you want the test run on
RelEng will create the buildbot patches that enable the tests, insert the tests into graph server, and work with the contact person while the tests are in staging to make sure the expected outcome is reached. Once the tests run as expected we can turn them on in production. In a perfect world, turnaround for this process is about a week and a half and involves a short Talos downtime.

The rest of the time allotted to our presentation was spent discussing where we should be setting our sights for Talos improvements. This looks to involve two relatively large undertakings:
  1. While it recently underwent some much needed improvements, the graph server still needs to be faster, more stable, scalable, and able to handle our ever-growing data sets. The blocker here is that no one really owns graph server and it's hard to know who should.
  2. Talos is barely holding up under the current load of tests, hardware, and infrastructure. It also works in such a way that a lot of manual involvement is required to add new tests. It would be awesome for it to work more like unittests where once individual tests are checked in, they would go into production immediately. It would then be possible for a developer to not only write a unittest for any bug fix, but also a performance test to go along with it.

Now this brings up the problem of what performance we want to measure and how we want to approach performance metrics in the long run. Alice made a great point when she stated that folks who are not trained in and accustomed to doing QA might find it challenging to write tests that actually produce a good metric for the performance they wish to measure. It's entirely possible to have tests that seem interesting on the surface but, when you drill down, don't provide any useful data for actually improving anything.

Do we want per-bug performance tests as we do with unittests? While it looks like this is a way to make a developer more accountable for their code, it's pretty obvious that this model wouldn't scale well at all with our current hardware and turnaround expectations. Imagine as many individual pageload tests as there are mochitests... I suspect no one wants to see that.

Performance testing would be better and more useful if it were targeted at specific features or areas of the product where someone is actually tracking the improvement/regression ranges as they are developed. That's a key part of Talos: a human is actually accessing the data, finding it useful, and making improvements to their feature/area as a result of this information.

While brainstorming with Aki on the potential of the graph server data, one idea really got me excited. Open up the data.

There's been a lot of hype lately about opening up data. In February of this year Tim Berners-Lee encouraged us to start thinking about open, linked data and how it could be the next round in how the Web helps us re-frame our world. In Canada the city of Vancouver opened up its data in the hopes of "improving liveability and governance" in the Metro area.


What if the Talos graph data was made available to the community and a challenge was created in the spirit of the marketing design challenges where we ask people to help us find new ways to view the data? I'd be really curious to see what kind of visualizations would come out of the larger community. RelEng doesn't have a very large community outside of employees, so this could be a great way to start working on creating one.
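
To give a feel for what someone outside RelEng could do with the data once it was open, here's a rough sketch in Python. Everything in it is an assumption made for illustration: the record layout (test name, build date, value in milliseconds), the test names, the numbers, and the 10% threshold are all made up, and none of it reflects how graph server actually stores or exposes results.

```python
from collections import defaultdict

# Hypothetical records: (test name, build date, measured value in ms).
# The names and numbers are invented purely for illustration.
SAMPLE_RESULTS = [
    ("pageload_example", "2009-12-01", 200.0),
    ("pageload_example", "2009-12-02", 210.0),
    ("pageload_example", "2009-12-03", 250.0),
    ("startup_example", "2009-12-01", 1000.0),
    ("startup_example", "2009-12-02", 990.0),
]


def percent_changes(results):
    """Return {test: [(date, percent change vs. the previous run), ...]}."""
    by_test = defaultdict(list)
    for test, date, value in results:
        by_test[test].append((date, value))

    changes = {}
    for test, runs in by_test.items():
        runs.sort()  # ISO-format dates sort chronologically as strings
        changes[test] = [
            (date, (value - prev) / prev * 100.0)
            for (_, prev), (date, value) in zip(runs, runs[1:])
        ]
    return changes


def flag_regressions(results, threshold=10.0):
    """List (test, date, change) for runs slower than the previous run by more than threshold percent."""
    return [
        (test, date, change)
        for test, deltas in percent_changes(results).items()
        for date, change in deltas
        if change > threshold
    ]


if __name__ == "__main__":
    for test, date, change in flag_regressions(SAMPLE_RESULTS):
        print(f"{test} got {change:.1f}% slower on {date}")
```

This is about the simplest possible view of the data, a list of run-to-run slowdowns. The interesting part of a challenge would be seeing what people come up with beyond that.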