Friday, September 16, 2011

Elvis Costello and History

Over the years, I have told many RTEMS users that I provide hosting and system administration for an Elvis Costello fan site (PHPBB Forum and Wiki).  I have had the fan site for about 9 years now.  What many of you probably don't realize is that I have also hosted the Mirror Site since 2006.

This is a personal effort and I receive no subsidy from OAR for doing this.  In order to have a static IP address and host services, I have to have a business class account, which costs more than a residential class account.  All of the sites I host plus the family Internet activities share a 7Mbps/1Mbps connection.

I had been helping on the technical side of administering the Elvis Costello Fan Forum for a while when the hosting service got hacked and our site was trashed.  We were unable to get anyone to contact us via email or phone for over a week.  I realized that I had a computer running GNU/Linux Fedora that was largely unused since I had upgraded.  Even though it was only a 350 MHz Pentium II with 384MB RAM, it was perfectly suitable to host a small (~100K hits a day) website.  I made a phone call to get a static IP address, moved the domains and within about a week we were back up.

A couple of years after that, the person who ran the Elvis Costello Wiki asked if I could host it.  I had already planned to upgrade to a 2.4 GHz Pentium 4 with 2GB RAM.  We decided to wait to move the Wiki until after the server upgrade. The hardware upgrade went easily and we moved the Wiki.  What surprised us both was that the performance on the site went to hell.  The server logs showed nothing, load looked low and no amount of tuning or probing helped. I begged my ISP for help and got an unlocked, uncapped cable modem to test with.  After more research and fighting, I learned that my router could not handle the number of simultaneous connections and was dropping them randomly.  I upgraded routers and the performance issues were settled.

The site has always used external hard disks for backup.  There is a script which runs every night and dumps all user directories, databases, etc. to a special directory on an internal disk.  Another script runs later and rsyncs the internal disk copy to an external one.  Backups are placed in dated directories and a few a month are saved.

When I set up the Mirror, I was more concerned with disk space than bandwidth consumption.  I don't have the fastest connection and I am sure users would appreciate a faster uplink. But I foot the bill and until there is funding, this is what there is.  The site mirrors at least 4 times a day.  Ralf Corsepius has set up automated checks which let us know when a mirror site is down or out of sync.

In late 2010, I became worried that the 2.4 GHz Pentium 4 was getting very old.  It was not new when it became the server and I was sure it had seen at least 6 years as a server.  The Elvis Costello Fan community rallied around my request for a new server and within a month or so, the fund raising goal was met.  The new server is from AS Labs who specialize in building custom GNU/Linux systems.  It has a quad-core 3.0 GHz CPU and 8 GB RAM.  It runs very cool and is far from overloaded.

Over the years we have had periodic power outages with the worst being the tornadoes of April 2011. But overall, I believe our uptime is very good.  I don't track it but thanks to the Elvis Costello fan community, it can't be down over 20 minutes without me getting an email. Thanks folks!

My wife and I have learned a lot about system administration over the years of maintaining these sites.  She personally reviews and approves every account request for the PHPBB Fan Forum.  The number of spam account requests is boggling and periodically she begs me to try to find another way to stem an increase.  The Wiki account requests and spam were solved when we instituted a very strict policy on getting an account.  A small group of people review and approve these accounts.

The server also hosts a couple of very low volume sites for friends.  They are more interesting from a content viewpoint and I want to share more about them in a future post.

Thursday, September 8, 2011

RTEMS Pair Programming

One of the most interesting and under-utilized RTEMS services that OAR Corporation offers is RTEMS Pair Programming.  This service is a great solution when dealing with a customer who wants a big head start on some type of development effort. In Agile terms, this is a development sprint with a team consisting of RTEMS and customer supplied experts.  Most of the time, we do this for BSPs and device drivers.

Left to Right: Walter Nakano, Wendell Pereira da Silva, Joel Sherrill and Jennifer Averett

The key to pair programming success is that we know RTEMS and the customer knows their hardware and test equipment.  OAR folks can concentrate on quickly providing the framework for the BSP and needed device drivers. Then we work together to author the device drivers. This provides them with specialized training on the details of the BSPs and device drivers that are critical to the success of their application.  Usually the initial testing is performed as a joint effort with subsequent detailed testing performed by the customer engineers.

Recently, OAR got to host Wendell Pereira da Silva and Walter Nakano from COMPSIS for two weeks of intense development activity.  Their system consisted of an embedded PC plus some add-on boards which added up to a lot of individual pins and ports to test.  They brought a LabVIEW test rig which allowed us to test every input and output on the Multi-I/O board.  The hardware list was:
  • RTD CME137686LX Embedded PC
    • 4 COM ports and i82551 NIC of particular interest to their project
  • RTD 17320HR Octal UART PC-104 board (PCI interface)
    • Exar PCI Vendor Id with 8 NS16550 compatible serial ports
    • NOTE: We only had one of these boards but they will have 4 in the real configuration!!
  • RTD 316HR Dual Synchronous Serial Port PC-104 board
    • single Zilog Z85230
  • RTD 6425HR Multi-I/O PC-104 board
    • 16 differential or 32 single-ended analog input channels
    • 4 analog output channels
    • 32-bit discrete I/O with 16 bit programmable for interrupt on input change
One thing should quickly stand out when viewing that hardware list.  They have a LOT of serial ports.  Four asynchronous on the embedded PC, 32 on 4 PCI-104 boards, and 2 synchronous on the Z85230 board for a total of 38 serial ports.  This is actually a classic example of a case where using the libchip serial driver framework would be very useful.  But the PC386 console driver was not designed this way.  Before the guys arrived, Jennifer reworked this driver to be libchip style.  At the same time, I factored out the mouse input stream parsing code.  It really wasn't BSP dependent and by moving it to cpukit/libmisc/mouse, I made it potentially available to every BSP.  Jennifer and I had tested COM1 and COM2 on qemu before they arrived but waited for their hardware to test COM3 and COM4.

COM1 and COM2 worked as soon as the cabling was correct. COM3 and COM4 proved more difficult.  After struggling to find a software problem, it occurred to me that it could be as simple as RTS/CTS not being wired together in the connector shell since we were using a 3-wire connection.  That was indeed the problem.  Next came the octal serial port board.

After realizing that we would end up with a libchip configuration table with 38 entries and most of them would be disabled for "normal" configurations, I had the idea to allow for dynamic registration of new "ports" in the libchip configuration table.   The idea is that if you probe for a bank of 8 serial ports and find them, then you can dynamically add 8 more entries to the libchip configuration table.  Currently this allows probes to insert entries prior to console_initialize() being called.  It is possible to allow them to be registered after this point but they would not be available to be /dev/console or used for printk().
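The dynamic registration idea can be sketched roughly as follows.  This is a simplified illustration and not the actual RTEMS libchip code: the `port_entry` type and the functions `port_register()`, `port_register_bank()`, `console_freeze()` and `port_table_size()` are hypothetical names made up for this example, and `console_freeze()` only stands in for the point at which console_initialize() runs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for one serial port table entry. */
typedef struct {
  const char   *device_name; /* e.g. "/dev/ttyE0" (made-up name) */
  unsigned long base;        /* base address of this UART's registers */
  unsigned int  irq;         /* interrupt line used by this port */
} port_entry;

#define MAX_PORTS 40

static port_entry port_table[MAX_PORTS];
static size_t     port_count;
static bool       console_frozen;

/* Add one port.  Refused once the console has been initialized. */
bool port_register(const port_entry *entry)
{
  assert(entry != NULL);
  if (console_frozen || port_count >= MAX_PORTS) {
    return false;
  }
  port_table[port_count++] = *entry;
  return true;
}

/*
 * Called by a probe that found a bank of `count` UARTs whose register
 * blocks are `stride` bytes apart.  Returns how many were registered.
 */
size_t port_register_bank(
  const char *const names[],
  unsigned long     base,
  unsigned long     stride,
  unsigned int      irq,
  size_t            count
)
{
  size_t added = 0;

  for (size_t i = 0; i < count; ++i) {
    port_entry entry = { names[i], base + i * stride, irq };
    if (!port_register(&entry)) {
      break;
    }
    ++added;
  }
  return added;
}

/* Stand-in for console initialization: the table is fixed from here on. */
void console_freeze(void)
{
  console_frozen = true;
}

size_t port_table_size(void)
{
  return port_count;
}
```

With this shape, a configuration without the octal board simply never grows the table, and a probe that finds one board adds exactly eight entries before the console is initialized.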

While I was implementing dynamic registration, Jennifer and our guests worked to get the PCI probe to find the card and the first serial port working.  We were surprised to learn that it didn't have a vendor Id of RTD but Exar.  This explained the sparse programming documentation from RTD.  As soon as the probe and one serial port worked, we switched to my dynamic registration code.  Soon all eight ports on the board we had were working.  Plus I added code to detect the 2, 4, and 8 port variants of the Exar chip.

Next was the dual port synchronous board.  Unfortunately, RTEMS does not have a Z8530 synchronous driver but does have a standard libchip asynchronous driver.  After fiddling to figure out the baud rate clock divisor math, we ended up with both ports working.  There was one issue in the driver we did not resolve in the two weeks they were here.  The two ports on a Z8530 share a single interrupt status register which, when read, clears the source.  You have to be extremely careful to touch it one time and process all interrupt sources on both ports.  The ports worked individually but not when both were installed.  Jennifer and I had a solution but not enough time to implement it. Hopefully Wendell and Walter can implement it and we can get this resolved in the main tree.
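The "read it once, then service both ports" rule can be illustrated with a small simulation.  This is only a sketch of the general pattern and not the fix Jennifer and I had in mind: the bit layout, the `sim_` variables and all function names are hypothetical, and the register is modeled in software rather than taken from the real Z8530 register map.

```c
#include <stdint.h>

/* Hypothetical bit layout for a status register shared by two ports. */
#define PORT_A_RX 0x01u
#define PORT_A_TX 0x02u
#define PORT_B_RX 0x04u
#define PORT_B_TX 0x08u

typedef struct {
  unsigned rx_events;
  unsigned tx_events;
} port_stats;

uint8_t    sim_status;       /* simulated shared status register */
unsigned   sim_status_reads; /* how many times it has been read */
port_stats stats[2];         /* [0] = port A, [1] = port B */

/* Simulated hardware side: the device raises interrupt sources. */
void sim_raise(uint8_t bits)
{
  sim_status |= bits;
}

/* Simulated read-to-clear access: reading returns and clears all bits. */
uint8_t read_status(void)
{
  uint8_t value = sim_status;

  sim_status = 0;
  ++sim_status_reads;
  return value;
}

/*
 * One handler for both ports.  The register is read exactly once and the
 * snapshot is used to service every pending source on both ports.  Two
 * per-port handlers that each read the register would lose events,
 * because the first read clears the bits the second handler needs.
 */
void shared_isr(void)
{
  const uint8_t pending = read_status();

  if (pending & PORT_A_RX) ++stats[0].rx_events;
  if (pending & PORT_A_TX) ++stats[0].tx_events;
  if (pending & PORT_B_RX) ++stats[1].rx_events;
  if (pending & PORT_B_TX) ++stats[1].tx_events;
}
```

The simulation shows why a driver written around one port at a time can work with either port alone and still fail when both are installed.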

Next was the Multi-IO board.  If you have been following my blog a while, then you might remember the entry RTEMS Shell as Debug Aid where I discussed adding commands to the RTEMS Shell to aid in debugging a Winsystems Multi-IO board similar in capability to this board.  One of the last things Jennifer and I had done to the existing multiio code was to define a board independent interface between the shell commands and the actual driver.  My plan was to let this interface evolve and grow as we learned more about user application requirements.  This was the first opportunity we had to write a driver to this interface and reuse the commands.  As might be expected, there were places where 0/1 based numbering of inputs still reflected the Winsystems board.  And there were places in the RTD documentation that were unclear.  But after a while of fighting these and the normal cabling issues, we were able to use the existing commands to debug the driver and verify that all discrete I/Os work both polled and interrupt driven and that all analog inputs and outputs work polled.  We ran out of time before we were able to attempt analog input interrupts.

The final thing we attempted was getting the RTEMS TCP/IP stack to run on this board.  It had an Intel i82551ER NIC which required using the drivers in the libbsdport kit of late model FreeBSD drivers.  This driver works on qemu when you configure qemu for the i82559 simulation.  We verified the basics were OK on qemu.  Then we moved on to the real hardware.  After the normal hunt for an extra cable and battle of the network settings, we were able to run the telnetd application from the network-demos module.

Walter and Wendell drove their hardware.  Jennifer and I were the main forces driving the code but they reviewed every line of code and we all verified that each line of code programmed the hardware as we all agreed it should be.  Along the way, if something was unclear, we took a break from coding and testing to focus on a portion of the RTEMS Open Class that was very specific to what we were working on. The goal was not only to have as much functional code as possible; it was also to ensure that the code was high quality and they left understanding it and capable of modifying it should the need arise.

At the end of two weeks, we all were thrilled.  Walter and Wendell had been sending home progress reports and every day I continued to be amazed at the progress we -- as a team -- had made.  This amount of progress was possible because each of us brought unique skills and knowledge to the table.  Jennifer and I knew where to reuse code from in RTEMS and how to create an elegant solution the RTEMS way.  Walter and Wendell were intimately familiar with their hardware and test equipment and ensured we tested well.  Together we reviewed all code to ensure they left understanding it.

I really enjoy teaching the RTEMS Classes but RTEMS Pair Programming is one of the most fun and personally rewarding services we offer.  I always come away amazed at how much is working at the end of an intense 2-3 week development sprint.  By bringing together engineers with complementary skills and knowledge, solutions are found quicker.  And solutions are ultimately what we all want.

Thanks to Walter and Wendell, and to Daniel, who couldn't make the trip, for two weeks of fun and productive work.