GNU Assembler Ported

I have now managed to port the GNU Assembler, as, to IanOS. I consider this a major achievement as it shows that the system is capable of running real, useful programs. The next step will be to port the linker, ld, at which point the OS can be considered to be self-hosting. But true independence will also call for a port of GCC. Having managed to port as, I think this is a real possibility. At that stage it should be possible for the system to compile itself (and then it will be time to look at getting it running on real hardware).

Apart from being a very useful step in its own right, this task has been extremely useful in ferreting out various bugs, particularly in the filesystem task. Whatever tests one runs, the real trial comes when you apply these routines in earnest. I’ve uploaded all the changes to the OS to GitHub and at some stage I’ll document the procedures necessary for these ports; but I’m too keen to press on to devote much time to that right now.

And it’s a nice sunny day, with the Monaco Grand Prix on TV soon. Perfect!

Future Direction

I’ve come to something of a dilemma. The original intent of my website was to provide a simple, easily understood example of a basic operating system. To keep this simplicity I have written my own versions of library routines such as printf, strcpy, strlen, etc., and my own utility programs such as ls, pwd, etc. I could continue this way, getting ever more complicated and reinventing many wheels, or I can attempt to make the system more comprehensive by making use of the many GNU programs that are available. In particular I would like to have an assembler and C compiler running on IanOS to make it self-hosting.

Writing an assembler (relatively easy) and a C compiler (much harder) would be interesting projects, but it’s not really what I want to concentrate on right now. I’d rather look into making the basic OS more robust and look into subjects such as networking, supporting PCI and USB devices, and other core enhancements. Perhaps even getting the system to run on a real physical machine. I don’t mind reinventing some wheels (after all, that’s what writing your own OS is all about) but there is a limit to what one person can reasonably do; and, certainly at the moment, I want to keep this as a one-person project.

To facilitate this, I’ve ported newlib (a version of the standard C library) to IanOS, and it seems to be working quite well. This is a first step towards porting binutils and gcc (with quite a bit more work still to be done) and provides many useful, well-debugged functions. Unfortunately this adds a level of complexity that detracts from the original aim. I have decided that, at least for the time being, I am going to freeze the version presented on my web site at the pre-newlib version, with perhaps minor changes as I discover bugs. This is the “master” branch on GitHub. I have now produced another branch, “newlib”, which contains the changes that I have made to aid porting and new versions of the tasks rewritten using the standard C library.

If anyone wants to experiment with newlib I suggest they download the “newlib” branch and then look at OSDevWiki (you should look at that website anyway if you have any interest in home-brew OSs), where instructions are given for producing a toolchain aimed at your own OS and for porting newlib. Should there be enough interest in this I may, in time, produce a detailed set of instructions for doing this.

Packing Structs

I’ve been sidetracked a bit of late looking at porting newlib (newlib is a standard C library – implementing it provides many useful functions and opens the possibility of porting GNU software). One little potential pitfall arose as a result of this.

By default gcc aligns the elements of structs to provide best performance. Depending upon the layout of the variables in the struct, this may lead to some empty padding bytes being inserted between variables (with the result that sizeof(struct x) will give a different result to adding the sizes of the elements of the struct). Previously I had been using the -fpack-struct flag to ensure that there were no such gaps, but this led to problems with newlib, so I removed the flag. And then the compiled system would not run.

The reason is that some structs need to be packed; examples are the structs that represent the physical layout of disk structures, such as the MBR, partition tables, boot sectors, and directory entries. The simple solution is to add “__attribute__((packed))” at the end of the struct definition:

struct DiskStructure
{
    /* members in exactly the order and sizes they have on disk */
} __attribute__((packed));

I have done this, so structs are no longer packed by default but the particular structs that need to be packed are. I have also rearranged the order of variables in most structs to put the longer ones first and the shorter ones last (although this, obviously, can’t be done with the previously mentioned disk structures). This reduces, or even eliminates, the need for padding bytes.
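To make the effect concrete, here is a minimal sketch. The struct names and members are made up for the example (the last two are loosely modelled on the MBR and its partition table) and are not taken from the IanOS source; the sizes in the comments assume gcc on a typical x86 target.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical struct: gcc pads after 'a' and after 'c' so that 'b'
   is 4-byte aligned and the struct size is a multiple of 4. */
struct Scattered {
    uint8_t  a;
    uint32_t b;
    uint8_t  c;
};                              /* 12 bytes, 6 of them padding */

/* Same members, longest first: only 2 trailing padding bytes remain. */
struct Reordered {
    uint32_t b;
    uint8_t  a;
    uint8_t  c;
};                              /* 8 bytes */

/* A partition table entry occupies exactly 16 bytes on disk. */
struct PartitionEntry {
    uint8_t  status;
    uint8_t  chsFirst[3];
    uint8_t  type;
    uint8_t  chsLast[3];
    uint32_t lbaFirst;
    uint32_t sectorCount;
} __attribute__((packed));

/* The partition table starts at byte 446 of the MBR, which is not a
   multiple of 4. Without packing, gcc would move it to offset 448 and
   the struct would no longer match the sector read from disk. */
struct Mbr {
    uint8_t               bootCode[446];
    struct PartitionEntry partitions[4];
    uint16_t              signature;
} __attribute__((packed));

/* Catch any layout mistake at compile time rather than at run time. */
_Static_assert(sizeof(struct Mbr) == 512, "MBR must be exactly one sector");
```

The compile-time check is optional, but it turns a system that mysteriously will not run into a build error.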

I’ll post more details about the implementation of newlib in a while. At the same time I’ll update GitHub with the new code.

Internal Clock

A clock is a very useful feature of any operating system, allowing files to be stamped with their creation and modification dates. (Without this, utilities such as make wouldn’t work.) I thought this would be difficult to implement, but it turned out to be really easy.

So there is now a clock which reads the time from the CMOS real-time clock on startup and then keeps time via the timer interrupt. Once a day, when midnight rolls over, it will read the RTC again to make sure it is still synchronized. File functions now implement the last modified date and the ls function displays this information. The RTC routine isn’t absolutely bullet-proof, so I’ll improve it in due course.
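The general shape of such a clock can be sketched as follows. This is an illustration only, not the IanOS code: the names, the 100 Hz tick rate and the assumption that the RTC reports its values in BCD (the usual default) are all mine.

```c
#include <stdint.h>

#define TICKS_PER_SECOND 100u    /* assumed timer interrupt rate */
#define SECONDS_PER_DAY  86400u

struct Clock {
    uint32_t secondsToday;       /* seconds since midnight */
    uint32_t ticks;              /* timer ticks within the current second */
};

/* In BCD each nibble holds one decimal digit, so 0x59 means 59. */
uint8_t BcdToBinary(uint8_t bcd)
{
    return (uint8_t)((bcd >> 4) * 10 + (bcd & 0x0F));
}

/* Load the clock from hour/minute/second values as read from the RTC. */
void ClockSetFromRtc(struct Clock *c, uint8_t bcdHour, uint8_t bcdMin,
                     uint8_t bcdSec)
{
    c->secondsToday = BcdToBinary(bcdHour) * 3600u
                    + BcdToBinary(bcdMin) * 60u
                    + BcdToBinary(bcdSec);
    c->ticks = 0;
}

/* Called once per timer interrupt. Returns 1 when midnight has just
   rolled over, which is the caller's cue to re-read the RTC. */
int ClockTick(struct Clock *c)
{
    if (++c->ticks < TICKS_PER_SECOND)
        return 0;
    c->ticks = 0;
    if (++c->secondsToday < SECONDS_PER_DAY)
        return 0;
    c->secondsToday = 0;
    return 1;
}
```

Keeping the per-tick work down to a couple of increments matters because this runs on every timer interrupt; the slow RTC access happens only at startup and at the daily resynchronization.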

As part of this I expanded the printf() (and kprintf()) function so that it recognizes the basic formatting flags and minimum width specification.
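For anyone curious what handling the flags and minimum width involves, here is a cut-down sketch for a single decimal conversion. FormatDecimal is a hypothetical helper invented for this example, not a function from IanOS; it understands only the ‘-’ and ‘0’ flags and a width, and assumes the spec is well formed.

```c
#include <stddef.h>

/* Formats 'value' into 'out' according to a spec such as "%5d", "%05d"
   or "%-5d". Returns the number of characters written (excluding the
   terminating NUL). 'out' must be large enough for the result. */
size_t FormatDecimal(char *out, const char *spec, int value)
{
    int leftAlign = 0, zeroPad = 0, width = 0;
    const char *p = spec + 1;            /* skip the '%' */

    /* Flags come first, in any order. */
    for (;; p++) {
        if (*p == '-')      leftAlign = 1;
        else if (*p == '0') zeroPad = 1;
        else break;
    }

    /* Then the minimum field width. */
    while (*p >= '0' && *p <= '9')
        width = width * 10 + (*p++ - '0');

    /* Build the digits in reverse; unsigned arithmetic copes with INT_MIN. */
    char digits[12];
    int n = 0;
    unsigned int u = value < 0 ? 0u - (unsigned int)value
                               : (unsigned int)value;
    do {
        digits[n++] = (char)('0' + u % 10);
        u /= 10;
    } while (u != 0);

    int length = n + (value < 0 ? 1 : 0);
    int pad = width > length ? width - length : 0;
    size_t i = 0;

    if (!leftAlign && !zeroPad)          /* spaces go before the sign */
        while (pad-- > 0) out[i++] = ' ';
    if (value < 0)
        out[i++] = '-';
    if (!leftAlign && zeroPad)           /* zeros go after the sign */
        while (pad-- > 0) out[i++] = '0';
    while (n > 0)
        out[i++] = digits[--n];
    if (leftAlign)                       /* '-' overrides '0' */
        while (pad-- > 0) out[i++] = ' ';

    out[i] = '\0';
    return i;
}
```

The one subtlety is where the padding goes relative to a minus sign: space padding comes before it, zero padding after it.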

Zeroing Memory Pages

When working on a (not entirely successful) attempt to improve the efficiency of the routines managing the PageMap table, it became apparent that there were problems when reusing memory pages. In particular, if a page had previously held paging table entries, its old contents might be interpreted as valid entries when it was reused; obviously this is wrong.

I have changed the routine in the dummy task which releases pages so that it also zero-fills them. As mentioned in my last post, zero-filling a new page is always a good idea. In this particular case it ensures that the contents can’t be interpreted as valid page table entries. For efficiency, the zero-filling is done by a small assembler routine (in tasking.s).
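The danger is easy to demonstrate in C. The sketch below is not the assembler routine in tasking.s; the function names are hypothetical, and it assumes 4 KiB pages holding 8-byte entries, as on x86-64, where bit 0 of an entry is the “present” flag.

```c
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE        4096u
#define ENTRIES_PER_PAGE (PAGE_SIZE / sizeof(uint64_t))
#define PTE_PRESENT      0x1ull   /* bit 0 of an x86 paging entry */

/* Counts how many entries would look "present" if this page were
   handed out again and used as a paging table. */
unsigned CountPresentEntries(const void *page)
{
    const uint64_t *entries = page;
    unsigned count = 0;
    for (size_t i = 0; i < ENTRIES_PER_PAGE; i++)
        if (entries[i] & PTE_PRESENT)
            count++;
    return count;
}

/* Hypothetical release routine: wipe the page before it goes back to
   the free pool, so stale entries can never be mistaken for live ones. */
void ReleasePage(void *page)
{
    memset(page, 0, PAGE_SIZE);
    /* ...the real routine would now mark the page free in the page map... */
}
```

After ReleasePage, every entry has its present bit clear, so a later walk of the paging tables stops at the page instead of following old pointers.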

At the same time I improved the efficiency of the zero-filling routines in startup.s (not that it particularly matters with code that is only executed the one time, but it makes the code neater too. I like neat code).

Memory Map

I’ve made a few changes to the memory map to tidy things up a bit. The main change is to move fixed areas, such as ROM and video RAM, to virtual addresses so that the space from 0x1100 to 0xFD000 is contiguous for use as a heap by the kernel. This should be plenty of space (don’t they say that about all limits?) as kernel heap usage is not that much, and tends to be transitory.

The big consumers of heap memory are the disk buffers, but these use addresses in the data area of the FS task, which has plenty of free space. As far as possible all of these fixed limits are specified in memory.h (and memory.inc – I’d love to do away with this file but haven’t yet thought of a way to use the same header for C and assembler code) and can be easily redefined there if necessary. There are probably a few absolute references to addresses in the code – I’ll change them if and when I catch them.

I’ve also changed the initial code, in startup.s, to zero all uninitialized kernel data before its first use. It makes sense to know the state of variables before they are used – at least it gives a consistent setup – and zero is a good value to choose. Null pointers show up more readily than ones containing arbitrary values.

All of these changes are reflected in the source code on GitHub, and will make their way onto the website code when I have time. (You can take this sentence to be true for every change that I post about from now on.)

Version Control

Up until now I have been using Subversion for version control, but I am persuaded that git may be a better choice. As a result I have created a git repository of the source code which is available here. I’ll try to keep this repository updated with any changes that I mention here (and any others that I make).

I’m very new to git, so forgive me if I don’t get it right first time.

Static Variables

I have now made a change to get rid of some fixed-size arrays. Previously the task table was a fixed array of limited size, as was the table of standard message ports. I have now changed these to dynamically allocated linked lists; this lets them grow without limit (other than the memory available).
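In outline, the change from a fixed array to a linked list looks something like this. It is a sketch under assumptions: the struct holds only a pid, the function names are made up, and malloc/free stand in for the kernel’s own heap routines.

```c
#include <stdlib.h>

/* Hypothetical task record; the real one holds far more state. */
struct Task {
    int          pid;
    struct Task *next;
};

/* Adds a task at the head of the list. Returns NULL if out of memory,
   which is now the only limit on the number of tasks. */
struct Task *AddTask(struct Task **head, int pid)
{
    struct Task *task = malloc(sizeof *task);
    if (task == NULL)
        return NULL;
    task->pid = pid;
    task->next = *head;
    *head = task;
    return task;
}

/* Linear search; returns NULL if no task has this pid. */
struct Task *FindTask(struct Task *head, int pid)
{
    for (; head != NULL; head = head->next)
        if (head->pid == pid)
            return head;
    return NULL;
}

/* Unlinks and frees the task. Returns 1 if it was found, else 0. */
int RemoveTask(struct Task **head, int pid)
{
    for (struct Task **link = head; *link != NULL; link = &(*link)->next) {
        if ((*link)->pid == pid) {
            struct Task *dead = *link;
            *link = dead->next;
            free(dead);          /* omit this and the heap slowly leaks */
            return 1;
        }
    }
    return 0;
}
```

The price of the flexibility is that lookup is no longer a simple array index, and every removal has to be paired with a free.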

As always, making these changes brought to light a couple of unnoticed bugs which had been causing system instability and also a horrendous memory leak in the kernel. These have now been corrected, leading to more stability. The memory allocation routine has also been improved and simplified. Previously I kept a record, in struct MemStruct, of the pid that allocated memory. The intention was to allow the kernel to automatically deallocate any unused memory once a process ends. On reflection this was not a particularly good idea. As the kernel allocates and deallocates all memory on its heap it seemed better to just make sure it didn’t leave any unused memory allocated (which I think I have managed so far). As for user memory, that is deallocated automatically when the pages are released by the DummyTask process.

I haven’t managed to crash the system with simple tests since making these changes, but I’m sure there are many bugs left lurking.

The source code download on my website now reflects these changes (and all previous ones mentioned), but I have yet to update the documentation and listings; I’ll do this in due course.

Growing the Stack

For a long time I have meant to look at code to detect stack underflow (that is, the stack growing down past the last page allocated to it) and grow the stack accordingly, but never got round to it. This was brought to a head when writing the disk buffering code; I was getting some unexplained errors that turned out to be stack underflows due to excessively recursive function calls (which I solved separately).

I’ve always thought this would be quite hard but I had a look, and it turned out to be very easy. I’ve now added code that will detect an underflow of the user stack (which is the only one that we don’t have tight control of) and allocate a new page if needed. I haven’t yet incorporated it in the code on the web, but for those interested, here is the revised page fault handler (where all the work is done).

You will need to make one more change for this code to work. Currently the user stack is located at the page starting at 0xA00000. The page immediately before that is actually part of the kernel stack address space and is not accessible by user programs. So UserStack needs to be changed in “memory.h” and “memory.inc”. I’ve made it 0xBFF000; that gives quite a few available pages for the user stack, and if it grows bigger than this there’s probably something wrong. (But you can set aside as much of the virtual address space as you like.)
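The core decision in such a handler can be sketched in a few lines of C. This is not the handler from the post: the function name is hypothetical, 0xBFF000 is the UserStack value given above, and the lower limit of 0xA00000 is my assumption about how far the stack is allowed to grow.

```c
#include <stdint.h>

#define PAGE_SIZE        0x1000ull
#define USER_STACK       0xBFF000ull  /* top stack page (value from the post) */
#define USER_STACK_LIMIT 0xA00000ull  /* assumed lowest page the stack may use */

/* Given the faulting address (on x86 the CPU leaves it in CR2), returns
   the page-aligned address that should be mapped to grow the stack, or
   0 if the fault is outside the region reserved for the user stack. */
uint64_t StackPageForFault(uint64_t faultAddress)
{
    if (faultAddress < USER_STACK_LIMIT ||
        faultAddress >= USER_STACK + PAGE_SIZE)
        return 0;
    return faultAddress & ~(PAGE_SIZE - 1);
}
```

A handler built on this would allocate a physical page, map it at the returned address and return to retry the faulting instruction; a result of 0 means the fault is a genuine error and should be handled as before.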

Disk Buffering

I took a little time out from writing an editor to implement a fairly simple method of buffering disk access. Obviously this is desirable as direct access to the hard disk is much slower than memory access.

Previously, whenever a sector was required the OS just loaded it from the disk. Now the sector is put into a buffer which is managed by a binary tree. When the OS requires a sector it first checks this tree to see if the sector is already in it. If so it returns a pointer to the existing buffer; otherwise it loads the sector and adds it to the tree.

In the current implementation the buffer is never freed. That’s not such a problem as the disk is small, it has very little data on it, and the system never runs for very long anyway. In due course I will implement flushing of old buffers from the tree. Not only is this necessary for memory reasons, but at some stage the tree will become so large that it is less efficient than reading direct from the disk. I’ll probably use a “least recently used” algorithm.

When a buffer is written to, it is marked as dirty. At crucial points the OS checks for dirty buffers and writes them to disk. All of this code is to be found in filesystem.c (plus btree.c, which contains routines to manage the binary tree).
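The scheme described above can be sketched as follows. This is an illustration rather than the code in filesystem.c and btree.c: all the names are hypothetical, malloc-style allocation stands in for the FS task’s own, and a small in-memory array plays the part of the disk so that the example is self-contained.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SECTOR_SIZE  512
#define FAKE_SECTORS 64          /* size of the stand-in disk */

struct Buffer {
    uint32_t       sector;       /* key: sector number on disk */
    int            dirty;        /* set when data differs from the disk */
    uint8_t        data[SECTOR_SIZE];
    struct Buffer *left, *right;
};

/* Stand-ins for the disk driver; sector must be below FAKE_SECTORS. */
uint8_t fakeDisk[FAKE_SECTORS][SECTOR_SIZE];
int diskReads, diskWrites;

void ReadSectorFromDisk(uint32_t sector, uint8_t *dest)
{
    memcpy(dest, fakeDisk[sector], SECTOR_SIZE);
    diskReads++;
}

void WriteSectorToDisk(uint32_t sector, const uint8_t *src)
{
    memcpy(fakeDisk[sector], src, SECTOR_SIZE);
    diskWrites++;
}

/* Returns the buffer for 'sector', reading it from disk only if it is
   not already in the tree. Returns NULL if out of memory. */
struct Buffer *GetSector(struct Buffer **root, uint32_t sector)
{
    struct Buffer **link = root;
    while (*link != NULL) {
        if (sector == (*link)->sector)
            return *link;                        /* already buffered */
        link = sector < (*link)->sector ? &(*link)->left
                                        : &(*link)->right;
    }
    struct Buffer *buffer = calloc(1, sizeof *buffer);
    if (buffer == NULL)
        return NULL;
    buffer->sector = sector;
    ReadSectorFromDisk(sector, buffer->data);
    *link = buffer;                              /* hang it on the tree */
    return buffer;
}

/* Walks the whole tree and writes back every dirty buffer. */
void FlushDirty(struct Buffer *node)
{
    if (node == NULL)
        return;
    FlushDirty(node->left);
    if (node->dirty) {
        WriteSectorToDisk(node->sector, node->data);
        node->dirty = 0;
    }
    FlushDirty(node->right);
}
```

Code that modifies a sector sets dirty = 1 on its buffer and leaves the write to the next FlushDirty. Note that a plain, unbalanced tree like this sketch degenerates towards a linked list if sectors happen to be requested in ascending order.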

Work on the editor progresses. It’s doing quite well now but still has a few bugs.

Whilst working on these two points a few bugs have arisen, which have now been corrected. (One of these involves a slight change to the format of executable files in “tasks”.) The download, and the source listings, now contain all of these changes.