bTree Database System:

tags: c database system

This site has a nice balanced b-tree (and database) system written in C that you can compile in with your C or C++ programs and use.

Although this site has the source code for the database and the b-tree, it also became the repository for my entire web-development system in C. So, this source code library contains everything you need to build lightning fast web sites using C!

My search for a database and balanced b-tree index system to use from C was a long one. After years of searching, I finally ended up writing my own, and this is what this web site is all about.

My first attempt was a C database and btree system called XXXX. This was a really good one, but it seemed VERY complicated to use. The massive array of function calls and things you had to do to get it all set up was daunting. And as hard as I tried, I never could get it to port from Windows to Linux. So my search continued!

Then I got a book called "C Database Development" written by Al Stevens. A REALLY good book by the way. I used the source code from the book, and it worked OK, but I kept hitting bugs, and I COULD NOT UNDERSTAND the source code. So I wasn't able to fix any bugs. The book was really good, but it did not entirely explain the details of the source code. And the source code was written in the old-style, tight, really-small-variable-name C style. So I wasn't able to understand it enough to make any fixes.

FINALLY I bit the bullet and decided to write my own. I had already written my own database system that ran on top of Al Stevens' b-tree code, so that was done. All I needed to do was TRULY UNDERSTAND how the balanced b-tree worked, and implement it.

So after massive studying of books, web pages, and so on, I FINALLY managed to figure out how the btree algorithm works. (OK, I may be a little slow, but for me it was hard). But it's nice to have source code that can insert a node into a balanced B tree, and I actually understand what it's doing!
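For anyone who has not seen it before, here is roughly what inserting into a balanced B-tree looks like. This is NOT the source code from this library, just a minimal in-memory sketch of the standard textbook algorithm (split a full node on the way down, then insert into a leaf). The names `BNode`, `btree_insert`, `btree_contains` and `btree_height`, the `int` keys, and the tiny node size are all assumptions made for illustration. A real index would store nodes on disk and never leak memory the way this sketch does.

```c
#include <assert.h>
#include <stdlib.h>

#define T 2                       /* minimum degree: nodes hold T-1 .. 2T-1 keys */
#define MAX_KEYS (2 * T - 1)

typedef struct BNode {
    int nkeys;                          /* number of keys currently stored */
    int leaf;                           /* 1 if this node has no children  */
    int keys[MAX_KEYS];
    struct BNode *child[MAX_KEYS + 1];
} BNode;

static BNode *bnode_new(int leaf) {
    BNode *n = calloc(1, sizeof *n);
    assert(n != NULL);                  /* sketch only: no real error handling */
    n->leaf = leaf;
    return n;
}

/* Split the full child parent->child[i] in two and move its median key
   up into parent. The parent must not be full when this is called. */
static void split_child(BNode *parent, int i) {
    BNode *full = parent->child[i];
    BNode *right = bnode_new(full->leaf);
    int j;

    right->nkeys = T - 1;               /* upper half of the keys moves right */
    for (j = 0; j < T - 1; j++)
        right->keys[j] = full->keys[j + T];
    if (!full->leaf)
        for (j = 0; j < T; j++)
            right->child[j] = full->child[j + T];
    full->nkeys = T - 1;

    for (j = parent->nkeys; j > i; j--) /* make room in the parent */
        parent->child[j + 1] = parent->child[j];
    parent->child[i + 1] = right;
    for (j = parent->nkeys - 1; j >= i; j--)
        parent->keys[j + 1] = parent->keys[j];
    parent->keys[i] = full->keys[T - 1];  /* median key moves up */
    parent->nkeys++;
}

/* Insert into a node that is known not to be full. */
static void insert_nonfull(BNode *n, int key) {
    int i = n->nkeys - 1;
    if (n->leaf) {
        while (i >= 0 && key < n->keys[i]) {   /* shift larger keys right */
            n->keys[i + 1] = n->keys[i];
            i--;
        }
        n->keys[i + 1] = key;
        n->nkeys++;
    } else {
        while (i >= 0 && key < n->keys[i])
            i--;
        i++;                                   /* child to descend into */
        if (n->child[i]->nkeys == MAX_KEYS) {
            split_child(n, i);
            if (key > n->keys[i])
                i++;
        }
        insert_nonfull(n->child[i], key);
    }
}

/* Insert a key and return the (possibly new) root. Pass NULL for an empty tree. */
BNode *btree_insert(BNode *root, int key) {
    if (root == NULL)
        root = bnode_new(1);
    if (root->nkeys == MAX_KEYS) {      /* the tree only grows taller here */
        BNode *new_root = bnode_new(0);
        new_root->child[0] = root;
        split_child(new_root, 0);
        root = new_root;
    }
    insert_nonfull(root, key);
    return root;
}

/* Return 1 if key is in the tree, 0 otherwise. */
int btree_contains(const BNode *n, int key) {
    while (n != NULL) {
        int i = 0;
        while (i < n->nkeys && key > n->keys[i])
            i++;
        if (i < n->nkeys && key == n->keys[i])
            return 1;
        n = n->leaf ? NULL : n->child[i];
    }
    return 0;
}

/* Number of levels; every leaf sits at the same depth in a B-tree. */
int btree_height(const BNode *n) {
    int h = 0;
    while (n != NULL) {
        h++;
        n = n->leaf ? NULL : n->child[0];
    }
    return h;
}
```

The "balanced" part comes from the fact that the tree only gets taller when the root itself splits, so all the leaves always stay at the same depth, even if you insert keys in sorted order.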

Along with the B-tree modules, I also have a database system that runs on top of it. I have a feeling most people that want the b-tree system will just use the b-tree code and not use the database (they may have their own database system in mind to run on top of the b-tree).

One of the great uses for this btree and database system could be in small device development, such as smart phones and tablets. Yes, there is SQLite and so on, but there are many reasons to JUST HAVE YOUR OWN. Like if you just want a little app that is TOTALLY SELF-CONTAINED and doesn't need ANY OTHER INSTALLS. You don't have to install anything, you don't have any REQUIREMENTS, you just run the little executable and it can read and write massive amounts of data!

C or C++
--------
I started out by implementing the database and the btree in C++, but then I decided to upgrade to C for various reasons:
- C is FASTER
- C binaries are much easier to distribute under Linux. If I compile my code on one x86 Linux system, it seems to work on most others! (I'm talking mainly about web hosting services).

So my current system is in C, although I still have the source code for the C++ versions. The C++ versions are mostly identical, but I've made a few bug fixes and enhancements in the C version, so it's definitely better to use the C version.

Server vs. Embedded
-------------------
Most of the free databases out there, like MySQL or PostgreSQL, are SERVERS, meaning that the database program runs as a process, and you "connect" to that process through a port (sockets) and communicate back and forth with the server via the SQL language. This is nice, and best for most applications. However, sometimes it's nice (AND massively faster!) to have the database logic linked RIGHT INTO YOUR APPLICATION. You gotta believe me, this thing FLIES compared to connecting to a database server. And you don't have to worry about "connection pools", connection limits, etc.

Of course the huge DISADVANTAGE is when you have MULTIPLE USERS. But even here, after many painful years of using this system and fixing multi-user bugs, I finally have it working pretty well even on multi-user web sites.

What is the problem with embedded database logic and multiple users? Mainly, it shows up when multiple users try to update the table or index at the same time. Getting atomic locking that actually works on both Linux and Windows was a major pain, but I THINK I have it solved! (It would be really interesting to have someone try this out on a HUGE web site and see how it stands up. Some of my web sites, mainly the football pool, might have a few hundred users hitting it at the same time. I wonder if it would die under the weight of thousands?).
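To give a feel for what that kind of locking involves, here is a minimal sketch of the POSIX side only, using `fcntl()` advisory record locks. This is an illustration under assumptions, not the locking code from this library: `lock_whole_file`, `unlock_whole_file` and `locked_increment` are hypothetical names, and the "counter stored at the start of a file" stands in for a real table or index update. Windows has no `fcntl()` locks, so a port would need a different call there (for example `LockFileEx`).

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Take an exclusive lock on the whole file, waiting until it is granted.
   Returns 0 on success, -1 on error (including interruption by a signal). */
int lock_whole_file(int fd) {
    struct flock fl = {0};
    fl.l_type = F_WRLCK;        /* exclusive (write) lock */
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;               /* 0 means "through end of file" */
    return fcntl(fd, F_SETLKW, &fl);
}

/* Release the lock taken by lock_whole_file(). */
int unlock_whole_file(int fd) {
    struct flock fl = {0};
    fl.l_type = F_UNLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;
    return fcntl(fd, F_SETLK, &fl);
}

/* Hypothetical read-modify-write: bump a counter stored at offset 0.
   Without the lock, two processes could both read N and both write N+1.
   Returns the new value, or -1 on error. */
long locked_increment(const char *path) {
    long value = 0;
    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0)
        return -1;
    if (lock_whole_file(fd) != 0) {
        close(fd);
        return -1;
    }
    /* Everything between lock and unlock is the critical section. */
    if (lseek(fd, 0, SEEK_SET) != 0 ||
        read(fd, &value, sizeof value) != (ssize_t)sizeof value)
        value = 0;              /* new or short file: start from zero */
    value++;
    if (lseek(fd, 0, SEEK_SET) != 0 ||
        write(fd, &value, sizeof value) != (ssize_t)sizeof value)
        value = -1;
    unlock_whole_file(fd);
    close(fd);
    return value;
}
```

Two properties of these locks matter for CGI-style web sites: they work between processes (not between threads of one process), and the kernel drops them automatically when the process holding them dies, so a killed process cannot leave the file locked forever.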

One big mitigation of the multi-user problem on web sites is using FastCGI. If you run your application under FastCGI, you usually get ONE process sitting there running and handling most of your users, instead of a process for EVERY request. So the web server won't kick off a new process each time. Your database is only opened once, even if there are 5 or 10 users online. It's almost like a SEPARATE database server! (Like MySQL).

One thing about web sites that you may not know, and may learn the hard way: users can kill and restart your process multiple times by hitting the ESC key, then hitting ENTER. It's a good way to try and break a web site. So you hit the "update data" button, and you don't see your report right away, so you hit ESC. Under Linux and CGI, your process simply gets killed! (My database system now handles this gracefully!). Then you hit ENTER to execute it again. You could do this, say, 15 or 20 times in a minute. The processes that you killed may not die right away. So you could have 5 or 10 processes all running and trying to update the same data at the same time! I have managed (again, learning the hard way) to get this database/btree system to handle unexpected process kills.

Note: the system does NOT handle "transactions" (you start a transaction, do your updates, then end the transaction: if it's not completed, everything is rolled back). The main goal, which I think I have achieved, is to keep the indexes in sync with the database: e.g. if you hit "escape" in the middle of an update, the indexes won't get corrupted.
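One common way to survive a kill in the middle of an update, short of full transactions, is a "dirty flag" in the index file: mark the index dirty before touching it, mark it clean when the update is finished, and check the flag the next time the file is opened. The sketch below shows only that idea. It is an assumption for illustration, not a description of how this library actually does it; the one-byte header at offset 0 and the names `index_begin_update`, `index_end_update` and `index_needs_rebuild` are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical one-byte header stored at offset 0 of the index file. */
enum { INDEX_CLEAN = 0, INDEX_DIRTY = 1 };

/* Write the flag byte and push it out of the stdio buffer to the OS. */
static int write_flag(FILE *f, unsigned char flag) {
    if (fseek(f, 0L, SEEK_SET) != 0)
        return -1;
    if (fwrite(&flag, 1, 1, f) != 1)
        return -1;
    return fflush(f) == 0 ? 0 : -1;
}

/* Call before the first write of an index update. */
int index_begin_update(FILE *f) {
    return write_flag(f, INDEX_DIRTY);
}

/* Call after the last write of an index update has been flushed. */
int index_end_update(FILE *f) {
    return write_flag(f, INDEX_CLEAN);
}

/* Call when opening the index. Returns 1 if a previous writer stopped
   mid-update (the caller should rebuild the index from the data file),
   0 if the index is clean, -1 if the flag could not be read. */
int index_needs_rebuild(FILE *f) {
    unsigned char flag;
    if (fseek(f, 0L, SEEK_SET) != 0)
        return -1;
    if (fread(&flag, 1, 1, f) != 1)
        return -1;
    return flag == INDEX_DIRTY;
}
```

Because the flag is flushed to the operating system before the real update starts, it is still there if the process is killed. Surviving a power failure as well would need an `fsync()`-style call on top of this, which the sketch leaves out.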

Of course, if you are using this code to run on a smart phone or some kind of device like that, you don't have to worry about multiple users!

The direction going forward for web development is just HTML, JavaScript, and web services. Silverlight, although not yet acknowledged by Microsoft as dead, is, in effect, dead. So is Flash. So if you ask, "So, what, is it just an .ASPX page?" NO. "Is it just a .jsp page?" NO. It's just an .HTM page!!! Going forward, it will be HTML (or HTML5), JavaScript, and web service calls. This makes software like myBtree even more valuable! Because your back end is just a web service (and a web service doesn't have to be fancy and use SOAP, although it could), a C program running under FastCGI would be the ULTIMATE web service for the back end!

What about compiled PHP?
------------------------

PRICES
------
Let me start off by saying that this IS an open-source system, in that you GET the source code with it. But it's NOT a free system, in that I charge $195 for the full source code and all supporting modules/programs. You'll get a license and you get the full use of the source code, including the right to modify it. You just don't have permission to re-sell the source code. You can use it in any application with no license fees or anything. It's just the $195 fee, and that's it.

If you buy this system, you get the whole thing: the database system, the underlying btree system, and the web development system (the "htm" class), as well as a whole bunch of supporting source code (the string system, the sockets system, and generic useful functions). You are probably only going to want to use small parts of this (probably just the btree, MAYBE the database system, and even MORE MAYBE the web development stuff).

WEB DEVELOPMENT IN C
--------------------

OPEN SOURCE
-----------
I'm not an expert in open source licenses, but the concept behind my little source code suite is that you CAN make modifications and import them back into the main project. You can modify it and just keep those changes to yourself, starting your own branch, or you can get your modifications back into the main library so that you can keep taking advantage of further updates.

If you DO decide to make modifications and keep them to yourself, then you're on your own from that point on!

As far as importing modifications back into the system, you'd need to let me know before you start what you plan on working on, and I'll let you know if that would be something I would accept back into the source code library.