The Lazy Programmer

March 2, 2010

Berkeley DB Viewer

Filed under: Database, Programming — ferruccio @ 11:27 am

I’m currently working on a project which uses Berkeley DB (BDB) as its data storage engine. I can’t say enough good things about BDB. It has proven to be a very fast and flexible way to store and retrieve data, it is very easy to use, and the documentation is absolutely top notch.

One issue I ran into, though, is that there is no good way to examine the databases for debugging purposes. Initially, I used the provided db_dump command-line tool. It dumps the entire contents of a database, which was fine when I was dealing with only a few records. But now I am working with databases with thousands, and soon millions, of records, and db_dump just won’t do.

I did a bit of googling but there didn’t seem to be any viewers available for Berkeley DB, so I decided to write one. I was going to do it in C# and WPF because that’s what I’m currently using. But I decided that, since BDB is cross-platform, a viewer for it should also be cross-platform. So I decided to use Qt to build the viewer.

After a couple of nights of coding, here is the result:

bdbvu

The interface is pretty simple. There is a button for opening a BDB database file. Once a database is opened, the “Database” combo box lets you pick which sub-database to view (in BDB terms a database is a key/value table and a single file can contain multiple “databases”). The list view on the left shows all the keys in that database and the view on the right shows the contents at the selected key. Both keys and values are formatted the same way db_dump -p -k formats them.

I put the source code for Berkeley DB Viewer (bdbvu) on Google Code. You can grab it at: http://code.google.com/p/bdbvu/.

To build it you need two things:

  1. Berkeley DB, installed and built with C++ support.
  2. Qt4.

Once you have those set up and you’ve downloaded the code, you will probably need to change the Qt project file to reflect your environment. I may eventually provide binary downloads, but I don’t have the time for that right now.

Finally, a few things to keep in mind:

  • I’ve only tested this with DB_RECNO and DB_BTREE databases, since that’s what I’m using at this time. But there’s no reason it shouldn’t work with DB_HASH and DB_QUEUE databases as well. (Where have I heard that before?)
  • I’ve only tested it with files that contain embedded databases, not with stand-alone databases. I know it won’t work with stand-alone databases because the code for that is a no-op.
  • I’ve only built and run this on OS X. Theoretically, it should work on Windows and Linux as well, but you never know until you try it.
  • When you open a database, it will load all its keys into memory. This seems to be pretty quick (a couple of seconds for a 3000+ record database on my laptop) but I may have to change it to use a more scalable method in the future.

That’s it for now. I hope you find this useful.

August 9, 2009

Dynamic C++ Update

Filed under: C#, Dynamic-Typing — ferruccio @ 2:51 pm

I’ve been tinkering with my Dynamic C++ project on occasion, trying to get it to build under OS X, without much luck. Most of it built just fine, but in a bunch of places the boost::variant::apply_visitor() function was giving me all sorts of grief.
The original problem was that I was passing an instance of a locally defined struct as the functor argument to apply_visitor(), such as:

unsigned int var::count() const {
	struct count_visitor : public boost::static_visitor<unsigned int> {
		unsigned int operator () (null_t) const { throw exception("invalid ...
		unsigned int operator () (int_t) const { throw exception("invalid  ...
		unsigned int operator () (double_t) const { throw exception("inval ...
		unsigned int operator () (string_t s) const { return s.ps->leng ...
		unsigned int operator () (list_ptr l) const { return l->size(); }
		unsigned int operator () (array_ptr a) const { return a->size(); }
		unsigned int operator () (set_ptr s) const { return s->size(); }
		unsigned int operator () (dict_ptr d) const { return d->size(); }
	};

	return boost::apply_visitor(count_visitor(), _var);
}
							

(more…)

June 15, 2009

Dynamic C++

Filed under: C#, Dynamic-Typing, Programming — ferruccio @ 10:01 pm

A while back, I started building a PDF parser in C++. I had been using the Adobe PDF IFilter to extract text from PDF files in order to index the content, but I also wanted to be able to extract formatting information, so I dug into the PDF format. The PDF format itself is fairly easy to parse, but the contents can be quite complex.

The PDF format consists of a series of objects, expressed in a simple syntax based on PostScript. There are primitives such as strings and numbers, and there are collections (arrays and dictionaries) which can contain both primitives and containers. You can see how things quickly become complicated when you have dictionaries containing arrays containing other complex objects.
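To make the nesting concrete, here is a minimal sketch of such an object model. It is only an illustration, not the parser's actual code: it assumes C++17's std::variant, and the names Object, ArrayPtr, DictPtr and depth() are made up for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <variant>
#include <vector>

struct Object;  // forward declaration: the containers below hold Objects

// Containers are held by pointer so that Object can refer to itself.
typedef std::shared_ptr<std::vector<Object> > ArrayPtr;
typedef std::shared_ptr<std::map<std::string, Object> > DictPtr;

// One PDF-style value: null, number, string, array or dictionary.
// (Hypothetical type for this sketch only.)
struct Object {
    std::variant<std::monostate, double, std::string, ArrayPtr, DictPtr> value;
};

// Returns the container nesting depth of obj (0 for a primitive).
std::size_t depth(const Object& obj) {
    std::size_t deepest = 0;
    if (const ArrayPtr* a = std::get_if<ArrayPtr>(&obj.value)) {
        assert(*a && "array pointer must not be null");
        for (const Object& item : **a)
            deepest = std::max(deepest, depth(item));
        return deepest + 1;
    }
    if (const DictPtr* d = std::get_if<DictPtr>(&obj.value)) {
        assert(*d && "dictionary pointer must not be null");
        for (const auto& entry : **d)
            deepest = std::max(deepest, depth(entry.second));
        return deepest + 1;
    }
    return 0;  // null, number or string
}
```

An array holding a dictionary that holds another array already has a depth of 3, and every function that walks such a value needs one branch per alternative, just like depth() does.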

(more…)

May 28, 2009

<XAML fest>

Filed under: .NET — ferruccio @ 7:01 am

I just finished XAML fest, a two-day introduction to Silverlight, XAML and Expression Blend. The event was held at Microsoft’s New England R&D Center in Cambridge, Massachusetts. The class centered around building a small web app using Silverlight. A lot of time was spent learning how to use Blend to build user interfaces.

Having spent a good portion of my career building Windows apps, I’ve had the opportunity to create UIs using the Win32 API, OWL, MFC, WTL and wxWidgets. I’ve dabbled in WPF but never did much with it since I’ve been spending most of my free time tinkering with Cocoa and Cocoa Touch. What I really like about using XAML is that you can lay out an entire interface, including a lot of behavior, without writing a single line of code.

(more…)

April 5, 2009

A Python snippet for reading binary data

Filed under: Programming, Python — ferruccio @ 7:31 pm

I’ve been experimenting with using Python to read data from binary files and started to notice the following pattern in my code.

  1. Read a block of binary data.
  2. Use struct.unpack() to break out individual fields.
  3. Create a dictionary from those fields using the appropriate key names.

(more…)

January 5, 2009

Returning multiple values from a function in C++

Filed under: C#, Programming — ferruccio @ 10:22 pm

Ideally, all functions should return just one value. There are many times, however, when returning more than one value makes a function much more convenient. The classic example is file input. When we read data from a file we want to know two things: did we reach the end of the file and, if not, what is the next piece of data?

Sometimes, we can encode multiple return values into one. For many of us, the first C idiom we learned from K&R is processing input a character at a time:

int ch;
while ((ch = getchar()) != EOF) {
    // do character processing...
}

This works because getchar() returns an int rather than a char, and the EOF macro is defined as a negative value (usually -1) that lies outside the range of valid characters. While this approach can work fairly well for simple cases, it quickly breaks down as the types we wish to return get more complex.
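As a small illustration of where the sentinel trick stops working, consider reading a whole line instead of a single character. Every std::string value, including the empty string, is a legitimate line, so there is no spare value left to mean "end of file". The sketch below is one possible workaround and not necessarily the approach the rest of this post takes; read_line() and LineResult are hypothetical names that bundle the success flag with the data in a std::pair.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical result type: first = "did we get a line?", second = the line.
typedef std::pair<bool, std::string> LineResult;

// Reads the next line from `in`. Returns {true, line} on success and
// {false, ""} once the input is exhausted.
LineResult read_line(std::istream& in) {
    std::string line;
    if (std::getline(in, line))
        return LineResult(true, line);
    return LineResult(false, std::string());
}

// Small self-check showing that an empty line and end-of-input are
// reported differently.
void read_line_self_check() {
    std::istringstream in("first\n\nlast");
    LineResult r = read_line(in);
    assert(r.first && r.second == "first");
    r = read_line(in);
    assert(r.first && r.second.empty());  // an empty line is real data
    r = read_line(in);
    assert(r.first && r.second == "last");
    r = read_line(in);
    assert(!r.first);                     // end of input
}
```

The caller now has to test r.first before touching r.second, which is clumsier than the getchar() loop but cannot confuse real data with the end-of-file marker.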

(more…)

November 18, 2008

Functional C++

Filed under: C#, Programming — ferruccio @ 9:56 pm

It’s been a while since my last post. It was near the end of a product shipping cycle and, well, you know how that goes. Then I got a bit addicted to Stack Overflow and spent way too much time each night reading and answering questions. Eventually, I started a couple of side projects which may eventually yield something interesting to write about.

Anyway, a little over a month ago, I answered a question on Stack Overflow titled “What is the one programming skill you always wanted to master but haven’t had time?” I didn’t have to think much to come up with an answer to that: Functional Programming.

I understand some of the fundamental concepts behind functional programming and occasionally I dabble a bit with LISP or I read a bit more of SICP but the practical applications of FP have been elusive to me.

(more…)

August 10, 2008

Adding a lock() statement to C++

Filed under: C#, Multi-threading, Programming — ferruccio @ 9:43 am

One of the things I like about C# that I miss in C++ is the lock() statement. It provides a simple way to control access to an object across multiple threads.

When I first started writing Windows apps, I used the Windows critical section API directly. After getting tired of calling EnterCriticalSection/LeaveCriticalSection, I created (I suspect like everybody else) a CriticalSection class that was a simple wrapper around the raw API.

Eventually, I got around to using mutexes from the Boost.Thread library. You can create a scope guard whose destructor releases the mutex it references, but it still lacks the simple elegance and clarity of being able to say:

lock (some-object) {
    do something to some-object
}
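For comparison, here is a rough sketch of the scope-guard approach mentioned above. To keep it free of external dependencies it assumes the C++11 standard library (std::mutex and std::lock_guard) as stand-ins for their Boost.Thread counterparts; the Counter class and run_workers() function are hypothetical.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical shared object whose state is protected by a mutex.
class Counter {
public:
    Counter() : value_(0) {}

    void increment() {
        // The guard locks the mutex here; its destructor unlocks it when
        // the scope ends, even if an exception is thrown.
        std::lock_guard<std::mutex> guard(mutex_);
        ++value_;
    }

    int value() const {
        std::lock_guard<std::mutex> guard(mutex_);
        return value_;
    }

private:
    mutable std::mutex mutex_;  // mutable so value() can lock it
    int value_;
};

// Starts `threads` workers that each call increment() `per_thread`
// times, waits for them, and returns the final count.
int run_workers(int threads, int per_thread) {
    assert(threads >= 0 && per_thread >= 0);
    Counter counter;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t)
        pool.emplace_back([&counter, per_thread] {
            for (int i = 0; i < per_thread; ++i)
                counter.increment();
        });
    for (std::thread& worker : pool)
        worker.join();
    return counter.value();
}
```

The guard does the job, but the locked region is only implied by the lifetime of a named variable, whereas lock() spells it out as a block.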

(more…)

August 3, 2008

Formatting Output with Boost

Filed under: C#, Programming — ferruccio @ 9:20 pm

Sometimes a GUI is overkill for a project. You just need a simple tool to do some task. Perhaps it needs to be scripted. So you whip up a console mode program and you eventually have to output something. At this point, many developers will simply ignore the C++ iostreams library and reach for good old printf(). I can certainly understand why. The iostreams objects are easy enough to use for simple formatting tasks. However, when you need to do something more sophisticated, you will often find yourself digging through reference material, muttering "this should be easy…"
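To illustrate the trade-off, here is a minimal sketch that produces the same line (a left-aligned, 10-character name followed by a right-aligned price with two decimals) both ways. The function names are made up for this example, and std::snprintf assumes a C++11 standard library.

```cpp
#include <cassert>
#include <cstdio>
#include <iomanip>
#include <sstream>
#include <string>

// printf style: the whole layout lives in one format string.
// The fixed buffer truncates very long names; fine for a sketch.
std::string format_printf(const std::string& name, double price) {
    char buf[64];
    int written = std::snprintf(buf, sizeof buf, "%-10s|%8.2f",
                                name.c_str(), price);
    assert(written >= 0);  // a negative result means a formatting error
    (void)written;         // silence "unused" warnings when NDEBUG is set
    return buf;
}

// iostreams style: the same layout needs one manipulator per attribute,
// and setw() has to be repeated because it only affects the next item.
std::string format_iostream(const std::string& name, double price) {
    std::ostringstream out;
    out << std::left << std::setw(10) << name << '|'
        << std::right << std::setw(8)
        << std::fixed << std::setprecision(2) << price;
    return out.str();
}
```

Both functions return the same string for the same arguments, but the printf version says it in one short format string while the iostreams version takes six manipulators.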

(more…)
