by PaulL » Mon Jul 05, 2010 10:26 pm
A progress update:
I have made significant progress in working on the Register classes. I found a few bugs in what I had before, so I have reworked things quite a bit. I have also made some significant deviations from the RoboIO source code at the I/O level.
For those unaware, much of the configuration of the Roboard occurs through the PCI Configuration Registers at I/O addresses CF8h and CFCh. Address CF8h selects a specific configuration register, and address CFCh holds the data of the selected configuration register. Both must be accessed as 32-bit registers. In the RoboIO code, when the value to be written is less than 32 bits, a write is performed by setting CF8h as appropriate, reading 32 bits from CFCh, masking out the bytes to be replaced, and merging in the intended “Write” value. To sum it up, it performs WriteDWord (setting CF8h), ReadDWord (reading CFCh), WriteDWord (setting the final CFCh value) for any North Bridge or South Bridge configuration register smaller than 32 bits.
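To make that three-step sequence concrete, here is a minimal sketch in Python. It is not the RoboIO code: `FakePortBus` is a made-up stand-in for real port I/O so the example can run anywhere, and the device number and register offset used at the bottom are arbitrary illustration values, not Roboard specifics. The CF8h address layout (enable bit, bus, device, function, dword-aligned offset) is the standard PCI one.

```python
CONFIG_ADDRESS = 0xCF8  # selects a configuration register
CONFIG_DATA = 0xCFC     # data window for the selected register


class FakePortBus:
    """Hypothetical stand-in for 32-bit port I/O; records every access."""

    def __init__(self):
        self.selected = 0
        self.config_space = {}
        self.log = []

    def write_dword(self, port, value):
        self.log.append(("write", port))
        if port == CONFIG_ADDRESS:
            self.selected = value
        elif port == CONFIG_DATA:
            self.config_space[self.selected] = value & 0xFFFFFFFF

    def read_dword(self, port):
        self.log.append(("read", port))
        if port == CONFIG_DATA:
            return self.config_space.get(self.selected, 0)
        return self.selected


def config_address(bus, device, function, offset):
    """Build the CF8h value; the low two offset bits are dropped (dword aligned)."""
    return (0x80000000 | (bus << 16) | (device << 11)
            | (function << 8) | (offset & 0xFC))


def rmw_write_byte(ports, bus, device, function, offset, value):
    """Write one byte using WriteDWord, ReadDWord, WriteDWord."""
    shift = (offset & 3) * 8  # position of the byte inside the dword
    ports.write_dword(CONFIG_ADDRESS,
                      config_address(bus, device, function, offset))
    current = ports.read_dword(CONFIG_DATA)
    merged = (current & ~(0xFF << shift) & 0xFFFFFFFF) | ((value & 0xFF) << shift)
    ports.write_dword(CONFIG_DATA, merged)


# Example values only: device 7 and offset 48h are arbitrary.
ports = FakePortBus()
addr = config_address(0, 7, 0, 0x48)
ports.config_space[addr] = 0x11223344
rmw_write_byte(ports, 0, 7, 0, 0x49, 0xAB)
```

After the call, the simulated register holds 1122AB44h and the log shows three port accesses for a single byte write.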
I have changed this entirely. ALL access to the North Bridge or South Bridge now occurs as 32-bit accesses. In short, I’ve taken a few 16-bit and 8-bit registers and grouped them into 32-bit registers. Since I already maintain the value of the register in the code, there is no need to re-read the Configuration Register’s value before writing it. The ONLY scenario where this would cause any sort of problem is if you have another program modifying the registers outside of this application at the same time (which isn’t a good idea anyway). So, the process for writing to the PCI Configuration Registers is now just WriteDWord (setting CF8h), WriteDWord (setting CFCh).
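A rough sketch of the cached ("shadowed") approach, again in Python rather than the .NET code used in the actual project. `RecordingPorts` and `ShadowedConfigRegister` are hypothetical names invented for this illustration, and the address and field position are example values only.

```python
class RecordingPorts:
    """Hypothetical stand-in for port I/O that just logs each 32-bit write."""

    def __init__(self):
        self.writes = []

    def write_dword(self, port, value):
        self.writes.append((port, value))


class ShadowedConfigRegister:
    """Keeps the last known 32-bit value so a write needs no prior read."""

    def __init__(self, ports, address, initial=0):
        self.ports = ports
        self.address = address  # value to place in CF8h
        self.value = initial    # cached copy of the register contents

    def set_field(self, shift, width, field_value):
        """Update one sub-field in the cache, then push the whole dword out."""
        mask = ((1 << width) - 1) << shift
        self.value = ((self.value & ~mask & 0xFFFFFFFF)
                      | ((field_value << shift) & mask))
        self.ports.write_dword(0xCF8, self.address)
        self.ports.write_dword(0xCFC, self.value)

    def get_field(self, shift, width):
        """Read a sub-field from the cache without touching the hardware."""
        return (self.value >> shift) & ((1 << width) - 1)


ports = RecordingPorts()
reg = ShadowedConfigRegister(ports, 0x80003848, initial=0x11223344)
reg.set_field(8, 8, 0xAB)  # replace the second byte of the dword
```

Only two port writes are logged, and `get_field` answers from the cache. The trade-off is the one already mentioned: the cache is only right as long as nothing else changes the register behind its back.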
As for accessing these “combined” registers, the functionality is broken out as separate functions in my classes, so this actually makes things simpler in the register and I/O code.
I have verified in several different ways that I can set configuration options, save them to an XML file, load them back from that XML file, and write them all back out to their proper registers. In short, I can configure the board’s configuration registers as I like, save them, and load them back, as easy as clicking a few buttons (literally, this is what I have in the app right now, see screenshot below).
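For anyone curious what a save/load round trip like this can look like, here is a small sketch in Python using the standard library's ElementTree. The element and attribute names (`Configuration`, `Register`, `name`, `address`, `value`) and the sample register names are assumptions made up for the example; the real file layout is not shown in this post.

```python
import xml.etree.ElementTree as ET


def registers_to_xml(registers):
    """Serialize {name: (address, value)} to an XML string."""
    root = ET.Element("Configuration")
    for name, (address, value) in sorted(registers.items()):
        ET.SubElement(root, "Register", name=name,
                      address=f"{address:#x}", value=f"{value:#010x}")
    return ET.tostring(root, encoding="unicode")


def registers_from_xml(text):
    """Parse the XML string back into {name: (address, value)}."""
    root = ET.fromstring(text)
    return {
        element.get("name"): (int(element.get("address"), 16),
                              int(element.get("value"), 16))
        for element in root.iter("Register")
    }


# Hypothetical register names and values, for illustration only.
original = {
    "ExampleBaseAddress": (0x48, 0x1122AB44),
    "ExampleClockSelect": (0x4C, 0x00000001),
}
restored = registers_from_xml(registers_to_xml(original))
```

If `restored` equals `original`, the values can be written back out to the registers exactly as they were saved.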
So, setting the Servo (PWM) clock frequency, Servo Base Address, Servo Base Address Enable, and any other registers is as simple as selecting or entering a few values in this configuration tool, then saving the configuration and loading it later to use. Did I mention that I built these classes so that I will be able to synchronize threaded calls to registers and functions? All I should have left to do is add a few SyncLock blocks around a few sections of code.
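As an illustration of the idea behind those SyncLock blocks, here is the same pattern sketched in Python, where a `with lock:` block plays the role that SyncLock plays in VB.NET. The `LockedRegister` class is hypothetical and only guards a cached value; it is not the project's register class.

```python
import threading


class LockedRegister:
    """Cached register value whose read-modify-write steps are serialized."""

    def __init__(self, value=0):
        self._value = value
        self._lock = threading.Lock()

    def set_bit(self, bit):
        with self._lock:  # only one thread may modify the value at a time
            self._value |= 1 << bit

    def clear_bit(self, bit):
        with self._lock:
            self._value &= ~(1 << bit)

    @property
    def value(self):
        with self._lock:
            return self._value


reg = LockedRegister()
threads = [threading.Thread(target=reg.set_bit, args=(bit,)) for bit in range(8)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
```

Each thread sets a different bit, and because every update happens under the lock, none of the updates can be lost to an interleaved read-modify-write.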
Performance… Performance has been my primary concern from the start. Granted, this isn’t as fast as you’ll get from an unmanaged C++ application, but I can perform 8-bit writes in about 1.6 microseconds, and perform an 8-bit register read and retrieve a bit value from a function in about 1.9 microseconds (accessing the value without re-retrieving it from the physical register takes about 0.06 microseconds). I’m using an unmanaged DLL to access RDTSC to get the CPU clock ticks to time these results, so they’re about as accurate as you can get; if anything, they are padded slightly by the call to RDTSC itself. I even timed direct reads and writes, getting something like 400 nanoseconds of overhead (time added by my register classes) on a write operation. I think, for the .NET framework, that’s not bad.
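The timing method can be sketched like this in Python. This is not the RDTSC DLL: it uses `time.perf_counter_ns` as a portable substitute, and the 1 GHz clock rate in the conversion example is an assumed figure for illustration. The point is the two steps: convert ticks to time, and subtract the cost of reading the timer itself.

```python
import time


def ticks_to_microseconds(ticks, cpu_hz):
    """Convert a CPU tick count to microseconds for a given clock rate."""
    return ticks * 1_000_000 / cpu_hz


def timer_overhead_ns(samples=1000):
    """Smallest observed cost of two back-to-back timer reads."""
    best = None
    for _ in range(samples):
        start = time.perf_counter_ns()
        end = time.perf_counter_ns()
        if best is None or end - start < best:
            best = end - start
    return best


def measure_ns(operation, repeats=1000):
    """Best-case duration of operation(), minus the timer's own overhead."""
    overhead = timer_overhead_ns()
    best = None
    for _ in range(repeats):
        start = time.perf_counter_ns()
        operation()
        elapsed = time.perf_counter_ns() - start
        if best is None or elapsed < best:
            best = elapsed
    return max(0, best - overhead)


# Assumed 1 GHz clock: 1600 ticks would correspond to 1.6 microseconds.
example_us = ticks_to_microseconds(1600, 1_000_000_000)
```

Taking the minimum over many runs helps filter out interruptions from the OS, which matters when the thing being measured only takes a microsecond or two.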
A 32-bit read off the North Bridge occurs in around 3.26 microseconds (roughly double the 1.6 microseconds for an 8-bit write to direct registers, but then that’s using two calls, one to write to CF8h, then one to read from CFCh).
I may be able to eke out a bit more performance, but I doubt it. The code is really clean once loaded, as most of the work is “front loaded” in order to keep running performance high. What I mean by this is that I’m setting as many values as I can when register and function objects are created, so that those values don’t have to get recalculated later (costing more CPU time while running). Such values include the value masks for register functions.
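A small sketch of what “front loading” the masks can look like, written in Python for illustration. `RegisterFunction` and the field name and position are hypothetical; the idea is simply that the masks are computed once in the constructor and reused on every access.

```python
class RegisterFunction:
    """A named bit field whose masks are computed once, at construction."""

    def __init__(self, name, shift, width):
        self.name = name
        self.shift = shift
        self.mask = ((1 << width) - 1) << shift   # bits belonging to the field
        self.clear_mask = ~self.mask & 0xFFFFFFFF  # every other bit

    def extract(self, register_value):
        """Pull this field's value out of a full register value."""
        return (register_value & self.mask) >> self.shift

    def insert(self, register_value, field_value):
        """Return the register value with this field replaced."""
        return ((register_value & self.clear_mask)
                | ((field_value << self.shift) & self.mask))


# Hypothetical 2-bit field occupying bits 4 and 5.
clock_select = RegisterFunction("ExampleClockSelect", shift=4, width=2)
current = clock_select.extract(0x25)
updated = clock_select.insert(0x25, 1)
```

At run time, `extract` and `insert` are just an AND, an OR, and a shift against values that already exist.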
With this configuration tool, I’ve even managed to set the PWM / Servo parameters manually to move an attached servo. To do so programmatically will be a breeze. All I will need to do is set up the configuration, load it, then access the proper registers and make the proper changes to them to move servos (just changing HREG and LREG for each servo after the configuration is loaded). I can’t get it to be any easier than that!
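As a back-of-the-envelope illustration of the arithmetic involved, here is a Python sketch that turns a servo pulse width into two counts. It assumes that HREG and LREG hold the high and low durations as plain counts of the PWM clock, which this post does not spell out; the real registers may need an offset or have a different encoding, so check the hardware documentation. The 10 MHz clock and the 20 ms frame are example figures only.

```python
def pulse_counts(clock_hz, period_us, pulse_us):
    """Return (high_count, low_count) in PWM clock ticks.

    Assumption: both values are simple tick counts of the high and low
    portions of one PWM period.
    """
    if not 0 <= pulse_us <= period_us:
        raise ValueError("pulse width must fit inside the period")
    high = clock_hz * pulse_us // 1_000_000
    total = clock_hz * period_us // 1_000_000
    return high, total - high


# Example figures: 10 MHz PWM clock, 20 ms frame, 1.5 ms (centered) pulse.
high_count, low_count = pulse_counts(10_000_000, 20_000, 1_500)
```

With those example figures the high portion is 15000 ticks and the low portion is 185000 ticks; moving the servo just means recomputing the pair for a new pulse width.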
The screenshot below is what things look like now. Highlighted values are value mappings that match Register values. Save XML saves the current Configuration to an XML file. Load XML loads the Configuration into the application. Load Db loads the Configuration from a database, where I have a table that I use to do a clean load of the Configuration. Load Db is VERY VERY slow, and will not be used except for addition or removal of registers (such as to remove registers I simply don’t use, for performance’s sake, or to add registers I find later); using the Db is easier than editing the XML file directly. Read All reads all registers, starting with the base address registers, then populating the remaining registers using the identified base addresses. Write All writes all register values back out; this will become the last step of LoadXML in code. Write is a test button for clocking performance; at the time I captured this image, it was set up for a write to Chip Select for the External SPI Chip Select Register.
The tree can be manipulated live, meaning if you click to set a GPIO bit to On and it is in Output mode, you will output a 1 on the corresponding pin (verified with my DMM at 3.3 volts for High, 6.2 mV for low). The tree itself responds slowly (a second or so to update), and I am accessing it over Remote Desktop Connection, so that may add to it, but it’s a big tree, and it’s slow. For setting things up, it’s fine. The Treeview, or any Windows Forms UI for that matter, won’t be used in the running ‘bot app anyway.
Another tidbit: the Register classes are separate from the UI stuff, so they are already standalone by design. The tree is just a way to visualize and manipulate what’s IN the register classes.
Take Care,
Paul