Friday, May 18, 2012

Technical Debt and RTEMS

Dr. Dobb's recently had an interview with Ward Cunningham, who developed the first wiki among other notable contributions. The interview is interesting and I recommend reading it. But I wanted to pass along some thoughts on one term that resonated with me.
Technical Debt: Cunningham uses this term to refer to work that needs to be done before a desired change can be implemented, or work required to propagate a desired change across a codebase. In the RTEMS world, we have multiple examples of this.
The most common case of RTEMS technical debt is when a single change must be implemented across all or a set of BSPs at the same time. Recently, Jennifer and I converted the MIPS port from the Simple Vectored (SV) to the Programmable Interrupt Controller (PIC) interrupt model. We did this because the MIPS/Malta we were developing a BSP for had a more complicated interrupt structure than previous MIPS boards, so it was logical to use the PIC rather than the SV model on this BSP. But to ensure that all MIPS BSPs were consistent, we had to implement the same change for six other BSPs.

There are cases with RTEMS where some preparatory work must be done before something else can be implemented. A prominent case of this was the CPU Scheduler Plugin Framework work by Gedare Bloom. Before the plugin framework existed, the RTEMS CPU Scheduler was bits and pieces of code embedded in the places where threads changed states and priorities. The plugin framework captured those decision points and made it possible to change the scheduling algorithm at application configuration time (e.g. via confdefs.h in RTEMS terms). This refactoring and cleanup was needed to be able to implement SMP support for RTEMS.

There are also cases where both types of debt must be paid within a single area. The RTEMS file system infrastructure has evolved over the past few years. Some desirable changes had to be propagated across all file system implementations, while others required cleaning up an area before refactoring or reworking it.

Not paying your technical debt can lead to long-term pain on a project. The most prominent case of this with RTEMS is when the PowerPC port was converted from the SV to the PIC interrupt model. In contrast to what Jennifer and I did for the MIPS, only some BSPs were converted to the PIC model at that time. This meant two different BSP interrupt models co-existed for years. Worse, they were not named at the time and were simply known as "old" and "new" for years. And if that wasn't bad enough, the method and type names did not match those in the x86 implementation. It has taken years of paying down technical debt to clean up this mess.

It is also something that RTEMS developers would like to avoid repeating. So when you submit a patch and are asked to modify some code you don't care about for consistency, or are told to be consistent with another BSP, remember that we are just asking you to avoid incurring technical debt. After all, it will probably be someone else who has to pay it. Better to avoid it altogether.

Tuesday, May 15, 2012

RTEMS Build System Ruminations

This post is a collection of ruminations after a recent post on the RTEMS Users mailing list. The post asked about a few issues the user was having (italics for quotes):
  • my changes not "taking"
  • is there any way to limit what bootstrap operates on?  Since it takes quite a while to complete and most of the BSPs are of no interest to me, I would like to avoid bootstrapping them.
Gedare Bloom and Ralf Corsepius replied to the post and I thought it would be a good idea to take the time to write up some issues, workarounds, and benchmarks for various operations using the current build system on a git checkout with no modifications. Any build times quoted in this article were measured on a quad-core computer in the RTEMS Build Farm or on my personal laptop, with the following specifications:
  • RTEMS Build Farm Computer
    • Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz
    • 4 GB RAM
    • Seagate 320 GB 7200RPM HDD (ST3320613AS)
    • Western Digital Caviar 1 TB 7200RPM HDD (WD1001FALS)
  • My Laptop
    • Intel(R) Core(TM)2 Duo CPU T7500  @ 2.20GHz
    • 4 GB RAM
    • Hitachi 160GB 7200RPM HDD (HTS722016K9A300)
The CPUs and disks in these computers are reasonably comparable in performance except that the build farm machine is quad-core. I would expect the quad-core machine to be somewhat faster in single-core, straight-line computation than the clock speeds indicate, but I would not expect a 2-3x difference in single-core performance.

I will be the first to admit that neither of the above computers is the fastest one available today. The fact that one could spend money and get faster computers is important. But these are reasonable computers, not obsolete ones. In fact, when I look at potential laptop upgrades, I am still surprised that my old laptop's CPU is rated much faster than those found in many laptops available today; you have to move to a higher-end laptop to beat it. When teaching RTEMS classes, I see attendees with computers that are both faster and much slower than mine. And performance is likely to be much worse in Cygwin or a virtual machine than on either of the two computers above.

The first issue was my changes not "taking". I personally always configure RTEMS with the --enable-maintainer-mode option, which is documented as follows:
--enable-maintainer-mode  enable make rules and dependencies not useful
                          (and sometimes confusing) to the casual installer
This tends to ensure that any changes to configure.ac and Makefile.am files are taken into account in the build tree and the appropriate files regenerated. There are limits to this: if you change a compiler flag, it will not result in everything being recompiled. However, if you change the way a configure option is interpreted and that change is propagated into cpuopts.h or bspopts.h, the impacted files should be recompiled.
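As a minimal sketch of what this buys you, assuming a build tree configured from a source tree in ../rtems (as in the configure command later in this post):

# hypothetical illustration: with --enable-maintainer-mode, a modified
# Makefile.am in the source tree is noticed on the next make, and the
# corresponding Makefile.in and Makefile are regenerated before compiling
touch ../rtems/cpukit/sapi/Makefile.am
make          # automake and config.status rerun, then the build proceeds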

But what if a .h file changes? Based upon my experiment of adding a one-line comment to confdefs.h, every test "init file" was recompiled and every test executable was relinked. This is as expected.

Now let's consider changing a C file in cpukit. As an experiment, I added a one-line comment to cpukit/sapi/src/exinit.c. This file contains rtems_initialize_data_structures() and thus every RTEMS application depends on this object file. The library librtemscpu.a was properly updated but no test was relinked. This untracked dependency is one possible explanation for my changes not "taking".

What if a C file in the BSP changes? As another experiment, I added a one-line comment to c/src/lib/libbsp/shared/bootcard.c. Just as in the cpukit experiment, this file is required by every RTEMS application. The library librtemsbsp.a was properly updated but no test was relinked. This untracked dependency is another possible explanation for my changes not "taking".
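A rough sketch of these two experiments, assuming the same ../rtems source tree layout, paths from the tree of that era, and the hello sample as the canary:

# dirty the header that every test configuration includes
echo "/* test comment */" >> ../rtems/cpukit/sapi/include/confdefs.h
make >b.log 2>&1                  # init files recompile, every .exe relinks
# dirty a core C file that every application links against
echo "/* test comment */" >> ../rtems/cpukit/sapi/src/exinit.c
make >b.log 2>&1                  # librtemscpu.a is rebuilt...
ls -l `find . -name hello.exe`    # ...but the test executables are untouched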

Gedare makes a point of stating that long-time RTEMS developers know these deficiencies and work around them. Personally, I often find something like this in my command history:
rm -f `find . -name "*.exe"` ; make >b.log 2>&1 ; echo $?
That command ensures that tests are relinked. It covers up the fact that the library dependency is not properly tracked.

The first part of the second issue was is there any way to limit what bootstrap operates on? The solution is to run bootstrap only from the lowest-level directory containing a configure.ac above what you modified or added. Often this is just a single BSP or, when initially adding a BSP, the directory c/src/lib/libbsp/ just above your new BSP. Gedare Bloom answered this quite thoroughly and I am just going to quote his answer:
Except when you add new files / modify configure.ac/Makefile.am files you need to re-run bootstrap at the closest level to your modified Makefile.am/configure.ac that contains a configure.ac file. 
So for example if you add a .c file into say cpukit/score/src then the file needs to be added to cpukit/score/Makefile.am and then you need to re-run bootstrap from cpukit because that is the closest parent directory with a configure.ac in it. For BSPs usually you just have to deal with the libbsp/CPU/BSP directory.
In order to run bootstrap from there I use a shell variable that points to my RTEMS root directory ($r) so that I can just ... "cd cpukit ; $r/bootstrap"
The second part of the second issue -- Since it takes quite a while to complete and most of the BSPs are of no interest to me, I would like to avoid bootstrapping them. -- is more complicated. When you initially clone the RTEMS git repository, you have to bootstrap the entire tree. A full bootstrap takes a long time and appears to be very single-threaded. On the build farm machine described above, it takes 5m18.331s of user time and 0m48.900s of system time, for a total of about 6 minutes. On my laptop, it took 7m56.394s of user time and 1m6.840s of system time, for a total of about 9 minutes. Having a quad-core CPU does not help. The bootstrap process has not gotten significantly faster in years. I recall various computers used over the years for RTEMS development taking from 5 to 12 minutes to execute a complete bootstrap. And this time is much longer on Cygwin due to the inefficient way it must implement POSIX process forking on MS-Windows.

Another thing to note is that bootstrap -p ONLY has to be run when you have modified a Makefile.am and changed the set of header files it installs. This generates the preinstall.am files. It does not need to be run after cloning the RTEMS repository because the preinstall.am files are checked into git. Many people run it more often than they need to.
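For example, a hypothetical run after changing which headers cpukit/score installs, using the $r convention from Gedare's note above:

# regenerate only the preinstall.am files below the current directory
cd $r/cpukit/score
$r/bootstrap -p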

Bootstrapping and git branches do not get along as well as one would hope. As Ralf Corsepius explains in his post in the thread:
One final advise: Do not switch git-branches in git checkouts. As git does not preserve timestamps, while make and the autotools are relying on timesstamps, this will break time-stamps on generated files and eventually result in havoc - You (need) a toplevel bootstrap with each "git branch checkout".
To avoid this, my advise is to use multiple checkouts instead.
I note that even with --enable-maintainer-mode enabled, my experience is that you often do get stuck bootstrapping from the top of the tree when switching branches; the build ends with a cryptic message in the output. This is a serious hindrance to using git. The typical git usage pattern does not include having multiple clones for different purposes -- that is what branches are designed for.
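Following Ralf's advice looks something like this (repository URL and directory names assumed for illustration):

# keep one bootstrapped checkout per line of work instead of switching branches
git clone git://git.rtems.org/rtems.git rtems-master
git clone rtems-master rtems-feature     # a local clone is cheap and fast
cd rtems-feature
./bootstrap                              # pay the bootstrap cost once per clone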

How long does it take to build RTEMS? The answer to this question depends on a lot of factors, including the obvious, like the computer you are using, and the not so obvious, such as how you configured RTEMS and whether you used the -j option to make to enable parallel jobs. If you configure RTEMS to include all of the tests, then the build time is significantly longer since there are 399 tests to compile and link. If you enable only the sample tests, this number drops to 13. The execution of the configure command itself is not a huge factor in build times, taking only about 4 seconds of CPU time on my laptop. It is the actual make that takes so long; the make actually performs a lot of configuration. On my laptop, I got the following times for configure and make with POSIX, TCP/IP, and all tests enabled for sparc/sis (forgive the bad line wrapping):
$ time ../rtems/configure --target=sparc-rtems4.11 --prefix=/home/joel/rtems-4.10-work/bsp-install/ --disable-multiprocessing --enable-cxx --disable-rdbg --enable-maintainer-mode --enable-tests --enable-networking --enable-posix --disable-deprecated --disable-ada --enable-expada --enable-rtemsbsp=sis >c.log 2>&1
real 0m12.511s
user 0m1.970s
sys 0m2.138s
$ time make -j3 >b.log 2>&1
real 10m1.806s
user 8m9.319s
sys 2m8.838s
Building all tests on the quad-core computer at -j7 resulted in a build time of approximately 5 minutes. Given the large number of tests, this indicates that there is opportunity to take advantage of multiple cores during a full build of RTEMS.

Building only the sample tests (e.g. --enable-tests=samples) on my laptop resulted in a build time of 3m4.420s real time, with system and user time coming close to adding up to the real time. That means roughly 2/3 of the full build time is spent compiling and linking the tests.

How much of the make time is actually configuration? After posting this, I was asked privately how much of the make is spent configuring versus compiling. To answer this question, I found the first file in RTEMS compiled for the target (cpukit/score/cpu/sparc/cpu.c in this case) and introduced a compilation error, so the first make stops as soon as configuration is done and compilation begins. Then I manually fixed the compilation error and invoked make again. The second make invocation is likely still verifying that the configuration didn't change, so the configuration overhead didn't go to zero, but it is close enough.
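The procedure, sketched with hypothetical commands (any syntax error appended to the file would do):

# break the first file compiled for the target so the first make stops
# right after the configuration phase
echo "#error stop after configuration" >> ../rtems/cpukit/score/cpu/sparc/cpu.c
time make -j3 >b1.log 2>&1    # measures (nearly) pure configuration
# undo the injected error, then time the remaining compile and link
(cd ../rtems && git checkout -- cpukit/score/cpu/sparc/cpu.c)
time make -j3 >b2.log 2>&1    # measures the rest of the build

On my laptop this produced: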
$ time make -j3 >b1.log 2>&1
real 1m38.264s
user 1m8.027s
sys 0m13.357s
$ time make -j3 >b2.log 2>&1
real 1m29.903s
user 1m56.809s
sys 0m24.112s
I repeated this experiment on the quad-core build farm machine and got the following results:

$ time make -j7 >b1.log 2>&1
real 0m36.652s
user 0m10.918s
sys 0m10.383s
$ time make -j7 >b2.log 2>&1
real 0m50.618s
user 1m37.673s
sys 0m27.296s
Looking at the above, it is pretty clear that the configuration part of make is a significant portion of the entire build time. On my laptop it was slightly over half, while on the quad-core computer it was about 40%. It also appears that the configuration stage is unable to take advantage of multiple cores, as user plus system time is less than the real time in both cases. On the quad-core, it took nearly three times more real time than CPU time, which likely indicates that it is I/O bound.

In contrast, the build portion of the make command's actions is clearly parallelizable. On the dual-core laptop, the approximately 90 seconds spent in the second step used about 120 seconds of CPU time, indicating that both cores were utilized for about 2/3 of the build time. On the quad-core machine, we see about 51 seconds of real time consuming 135 seconds of CPU time, for about 2/3 utilization again. The build time was reduced about 45% by moving from the dual-core to the quad-core computer.

I am personally a proponent of continuous integration and testing. It would be a boon to the RTEMS Project if there were a buildbot or similar system to get build and test execution feedback on every commit. Even better would be to get this feedback before the patch is officially committed. Considering that building all the source for one BSP with all tests takes 5 minutes on a reasonable quad-core computer and NO TESTS WERE RUN, one can see the challenge. There are approximately 145 BSPs in the tree currently when one considers variants. On this computer, it would take over 12 hours to build all BSPs and their tests. This assumes a fresh checkout and a single bootstrap; if you bootstrapped for each BSP, the build would take over 24 hours without executing any tests. Add in a test build of each target in a multilib configuration and the documentation, and that time goes up even further. And that is a SINGLE CONFIGURATION -- it does not include verifying that BSPs build with and without networking or POSIX enabled.
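The back-of-the-envelope arithmetic, using the roughly 5 minute build and 6 minute bootstrap measured above:

145 BSPs x  5 minutes each        =  725 minutes, a bit over 12 hours
145 BSPs x (5 + 6) minutes each   = 1595 minutes, well over 24 hours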

This is completely unacceptable for a continuous integration and test effort. According to via.ca, we have had an average of 2.34 hours between commits since moving to git. No single solution will give us a fast enough turnaround on building and testing. In order to achieve a turnaround under 2.34 hours, we will have to address the speed of the bootstrap, address the speed of the build process, distribute the building and testing, be smart about building and testing only the areas impacted, and ultimately throw more hardware at the problem.

As final food for thought, this is just for RTEMS itself. It does not account for the testing that should be done on the GNU tools we rely upon (e.g. binutils, gcc, gdb, and newlib). A full build and test cycle for all targets can take up to 4 days on the same quad-core computer. This time varies based upon the languages being built and tested, but GCC simply has a lot of tests.

Monday, May 14, 2012

Using sed to Remove CVS Ids from RTEMS

Over the Christmas break, the RTEMS Project converted from CVS to git. We have all made mistakes as we transitioned to git and will admit we are still learning. But our workflow is improving and we are making fewer mistakes. However, there are still a number of outstanding tasks left over from the transition:

  • Remove CVS $Id$ Strings
  • Move away from ChangeLog files and define a new "history" file which lists major changes
  • Convert release procedure to git
Since it has been well over four months since we initiated the conversion to git, I decided it was time to push on removing the Ids. Being comfortable with bash and sed, I decided to see how many of these I could remove without editing a file by hand. This turned out to be harder than I expected.

First, there are a lot of types of files in RTEMS, and the comment structure varies accordingly. There are C-style comment blocks, C++ one-line comments, "# to end of line", "; to end of line", etc.
Second, even within a single language, there were differing CVS Id string formats. For example, I expected most Id strings in C code to be at the end of a comment block like this:

 *
 *  $Id$
 */

But sometimes a file would have something like this:

 *  end of a comment paragraph
 *
 *  $Id$
 *
 */

In the above case, removing the $Id$ line and the preceding line would leave an undesired blank line at the end of the comment block. I implemented the removal as a set of sed transformations, applied in an order which matched the longest sequence I had identified before the ones which matched shorter sequences. On top of that, there were limits on how many transformations could sensibly occur inside a single invocation of sed, so it was easier conceptually to take the output of one set of sed transformations and feed it to another set. The following sed command file dealt with removing the $Id$ line and the preceding comment line:

# Remove CVS Ids which are more or less like this (embedded C, preceding)
# ^ *
# ^ *  $Id
/^ \* *$/{
 N
 # delete when the appended second line is a CVS Id line (plain or @(#) form)
 /^.*\n \* *\$Id.*/d
 /^.*\n \* *@(#) \$Id.*/d
}
Note that sometimes the code had " * @(#) $Id$" and this single sed command file would properly remove both patterns.

The solution turned out to be a set of sed command files and a shell script. The shell script invoked sed multiple times in a pipeline; each stage in the pipeline performed a different transformation whose output was fed into another invocation of sed with a different command file.
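I am not reproducing the script here, but its shape was roughly the following (command file names hypothetical; the real pipeline had eight stages):

#!/bin/sh
# run each C source or header through every stage's sed command file
for f in $(find . -name "*.[ch]") ; do
  sed -f stage1.sed "$f" | sed -f stage2.sed | sed -f stage3.sed \
    > "$f.tmp" && mv "$f.tmp" "$f"
done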

In the final step, if the first line of the file had ended up blank because of the transformations, I removed it with this sed command file:

# Remove the first line when it is blank
1{
 /^$/d
}

In the end, I had 28 sed transformations in an 8-stage pipeline. The scripted edits turned out to be a 2.7 megabyte diff which modified 6376 files. I was left with 117 files to edit by hand, and only one file was mishandled by the script. The changes can be viewed at http://git.rtems.org.
I hope it is understood that I compiled a lot during this effort. I wanted to make sure that every file I had touched still compiled. In doing this, I learned that a number of odd configurations in RTEMS had not been built recently and were, in fact, broken before I started.

If anyone is interested, I can make the shell script and set of sed scripts available, but I must disclaim that they were not written to be classroom sed examples. They are understandable but probably neither optimal nor perfect. They were written for a one-time massive edit and will only be reused to remove the CVS Id strings from other modules owned by RTEMS.

This could certainly have been done with other tools, including awk and Python. But I knew sed and bash, and using what you know is often the most effective way to get the job done.