2007-12-21
Letting PHP objects behave like arrays is quite powerful. So is being able to chain together methods in the spirit of jQuery like so:
$a = new user();
$a->firstName('Jim')->lastName('Lansing')->address1('123 Birch Blvd.');
See what I mean? It's just elegant. Now if only I could string them off the original constructor ... but that would require different instantiation syntax and the use of exceptions, which I still don't particularly like after all these years.
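A minimal sketch of how chaining like this can work, assuming every setter returns the object itself. The class body, the __call-based setters, the get() accessor and the static create() factory are all illustrative assumptions, not the code from the upcoming ORM class. The factory is one possible form of the "different instantiation syntax" that lets a chain start at construction.

```php
<?php
// Hypothetical stand-in for the user class above; not the real ORM code.
class user
{
    private $data = array();

    // Static factory: one way to start a chain without a separate `new` line.
    public static function create()
    {
        return new self();
    }

    // Treat any unknown method call as a setter for a field of that name.
    public function __call($field, $args)
    {
        $this->data[$field] = $args[0];
        return $this; // returning the object is what makes chaining possible
    }

    // Read a field back, or null if it was never set.
    public function get($field)
    {
        return isset($this->data[$field]) ? $this->data[$field] : null;
    }
}

$a = user::create()->firstName('Jim')->lastName('Lansing');
echo $a->get('firstName'), ' ', $a->get('lastName'), "\n";
```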
So, updates are coming. The above advanced concepts will be present in my soon-to-be-released ORM class and the next version of Autoform. Autoform hasn't been updated in too long, and the version currently in the works probably shouldn't be called Autoform anymore because it looks so different. No, I'm not changing the name.
2007-11-03
My day job is PHP, as are most of my hobby efforts, so I need a break. For the past 1.5 years I've been tracking Xynth, and for the past month I've been looking at the code and poking around. It works great on my 233MHz, 64MB RAM, 2.8 hour battery life Toshiba Satellite pride-and-joy hardcore minimalist testbed of a laptop. It tests a coder's ability to write efficient, non-bloat software to the max, and I love it. Enough gushing.
So Xynth runs great and compiles within 15 minutes. Xorg hasn't a chance. The question now is, in what way do I contribute? Xynth itself is undaunting, but helping Alper with the new GTK+ port is beyond my experience. Plus, larger GTK apps would drown my laptop, so I don't plan to use them. Xynth needs a better window manager so perhaps I'll port something existing, or try out my own ideas for a window manager that maximizes productivity. More to come.
2007-10-14
Autoform has undergone more major changes; a release will soon be upon us. Switching to class-based field types has enabled me to fix the previously broken field array capability. I have to say that the field array portion now functions amazingly.
Some more benefits of class-based field types:
- It allows one to modify a field even after it has been added to an instance of autoform
- Autoform is now infinitely more extensible thanks to OO inheritance
You'll also be able to iterate directly over an autoform instance as it will iterate over the fields present in the definition.
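A rough sketch of how both ideas (modifying a field after adding it, and iterating over the form) could fit together, assuming PHP's IteratorAggregate interface and magic property methods. The class names autoformSketch and textField here are placeholders, not Autoform's real classes.

```php
<?php
// Hypothetical field and form classes; names are for illustration only.
class textField
{
    public $label = '';
}

class autoformSketch implements IteratorAggregate
{
    private $fields = array();

    // Assigning to an unknown property registers a field under that name.
    public function __set($name, $field)
    {
        $this->fields[$name] = $field;
    }

    // Reading the property hands back the same object, so it can still be modified.
    public function __get($name)
    {
        return $this->fields[$name];
    }

    // foreach over the form walks the fields in the order they were defined.
    public function getIterator(): Traversable
    {
        return new ArrayIterator($this->fields);
    }
}

$form = new autoformSketch();
$form->name = new textField();
$form->name->label = 'First Name'; // modified after being added to the form

$labels = array();
foreach ($form as $fieldName => $field) {
    $labels[$fieldName] = $field->label;
}
```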
2007-08-01
It bothers me that current versions of Autoform have limited support for field arrays so I'm doing something about it. This means drastic internal changes (yet again), but it will remain mostly backward compatible. Probably around 90% backward compatible. There will be a different way of specifying "multiselect#" type fields, but the current method probably seems quite random anyway.
Also, a field definition will become a class:
$n = new textField();
$n->label = 'First Name';
$autoformInstance->name = $n;
... rather than the old way of using associative arrays:
$autoformInstance->addField(
'name',
array('type'=> 'text', 'label'=>'First Name')
);
I like the new syntax/method better, so hopefully others will too. Oh, and you'll be able to use your previous associative-array field definitions with 0 changes.
Best of all, using Autoform with HTML templates will now be more powerful. More on that later.
2007-07-24
I spent more than 5 hours working on ORM over the past two weeks. Getting the code to an elegant state is all about internal data structures. I've been tweaking the way to specify the relations between tables, but it could surely use more refinement. The trick is keeping complexity low while still having the necessary information to do CRUD on the relations. I haven't gone the same route as Ruby on Rails (and I don't intend to) so I only have two types of relations: hasOne and hasMany.
Since I want my project to be usable within existing applications I don't want to assume anything about the database structure. And there's no way in hell I'm going to build it around ridiculous conventions involving plural table names and singular foreign key fields. That's a waste of code. ORM shouldn't be a test of one's ability to code rules for a portion of the English language.
hasOne Relationship on a user object
Most commonly this would be called from within the constructor of the object extending the ORM object. Defined like this:
$this->hasOne('status', 'user', 'this.statusId=status.id');
The first parameter is the local name given to the relation. If you have a user object with the specified relation, you can get the name of its status with: $userObject->status->name.
The second parameter specifies the table to modify when setting/modifying the relation. In this case, the foreign key resides in the user table (and is the statusId field).
The third parameter begins the mappings portion. These mappings will be used in the actual SQL query's where clause to fetch the related data into an ORM object.
The result of this will be that the user object will now have a publicly accessible field called status which contains a status object representing the status of the user object.
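To make the mapping parameter concrete, here is a small sketch of how a mapping string might be turned into a where clause. It assumes the ORM substitutes each "this.field" with the owning object's value; the function name mappingToWhere and the quoting approach are hypothetical, and the real project may do this quite differently.

```php
<?php
// Hypothetical helper: turn a relation mapping into a where clause by
// substituting each "this.field" with the owning object's quoted value.
function mappingToWhere($mapping, array $row)
{
    return preg_replace_callback(
        '/\bthis\.(\w+)/',
        function ($m) use ($row) {
            // Doubling single quotes stands in for real driver escaping.
            return "'" . str_replace("'", "''", (string) $row[$m[1]]) . "'";
        },
        $mapping
    );
}

$user = array('id' => 7, 'statusId' => 3);
echo mappingToWhere('this.statusId=status.id', $user), "\n"; // '3'=status.id
echo mappingToWhere('this.id=address.userId', $user), "\n";  // '7'=address.userId
```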
hasMany Relationship on a user object
Declared like this:
$this->hasMany('addresses', 'address', 'this.id=address.userId', 'address.name asc');
The first three parameters are the same as with hasOne. The fourth parameter is used in the order by clause when fetching the addresses.
Modifying Relations
Creating, destroying and modifying relations works in simple cases, especially in those involving a single foreign-key field or a mapping table with two foreign-key fields.
With the given hasOne relationship defined on the user object above, one can do the following:
// this sets the related status using an object
// a status record with an id of 3 should exist in the DB
$userObject->status = new status(3);
$userObject->save();
// this sets the related status using the id of a status record
$userObject->status = 3;
$userObject->save();
// remove the relation altogether
$userObject->status = 0; // only works if there is no status record with id of 0
$userObject->save();
With the given hasMany relationship, we can do the following:
// relate to existing address objects
$userObject->addresses = array($address1, $address2);
// relate to the ids of existing address records
$userObject->addresses = array(2, 24, 44);
// modify existing address relations
$addresses = $userObject->addresses;
unset($addresses[2]); // remove the third address in the relation
$userObject->addresses = $addresses;
// remove all related addresses
$addresses = $userObject->addresses;
$userObject->addresses = array(); // this merely un-ties the addresses from the user
foreach($addresses as $a) { // this block actually deletes the address records
$a->delete();
}
2007-06-20
Over the past two days I made some more progress on Knox, mainly because I took a break from trying to make it use TESI. Instead, I focused on its use of libiterm. This means I don't have to worry about the details of resizing my virtual terminals (PTYs) since iterm has a simple function for it. Also, since my last writing about Knox and TESI I re-discovered the real way to "resize" a PTY. One must use ioctl() with TIOCSWINSZ and a winsize struct (TIOCGWINSZ reads the current size back), NOT environment variables.
Today I also found some flaws in my Rectangular Area Division code that result in some off-by-one sizes. Gotta track that down eventually if I intend for VTs to be roughly equal in size.
With my current color-handling code, iterm instances start up with yellow text on a blue background. It's ugly so I secretly hope I did something wrong. I really have quite a bit left to do, like making backspace work in the command window, and sending function, arrow and other keys to the VTs.
2007-06-14
It's time to update Autoform. Using Autoform with a template that will be rendered by a templating system isn't as streamlined as it should be. To fix this, I'm thinking of allowing the XHTML for a form field to be returned by referencing the field name as a class member variable, such as this:
echo $af->email_address;
For getting labels and error messages:
echo $af->email_address_label;
echo $af->email_address_error;
And, for consistency, I'd like to allow fields to be added to a form by a similar means:
$af->email_address = array('type' => 'text', 'label' => 'Email');
This means that you'll have to pass the Autoform object that contains your field definitions to your templator, but that's no big deal.
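A sketch of how the member-variable access could be wired up with PHP's magic __get and __set. Everything here is an assumption for illustration: the class name, the setError() method, and the bare-bones input markup are not Autoform's actual output.

```php
<?php
// Hypothetical sketch; the real Autoform internals are not shown in this post.
class autoformTemplateSketch
{
    private $fields = array();
    private $errors = array();

    // $af->email_address = array(...) adds a field definition.
    public function __set($name, $definition)
    {
        $this->fields[$name] = $definition;
    }

    public function setError($name, $message)
    {
        $this->errors[$name] = $message;
    }

    // Handles $af->email_address, $af->email_address_label, $af->email_address_error.
    public function __get($name)
    {
        // An exact field name wins, so a field literally named "x_label" still works.
        if (isset($this->fields[$name])) {
            $type = htmlspecialchars($this->fields[$name]['type']);
            return '<input type="' . $type . '" name="' . htmlspecialchars($name) . '" />';
        }
        if (preg_match('/^(.+)_(label|error)$/', $name, $m) && isset($this->fields[$m[1]])) {
            if ($m[2] === 'label') {
                return htmlspecialchars($this->fields[$m[1]]['label']);
            }
            return isset($this->errors[$m[1]]) ? htmlspecialchars($this->errors[$m[1]]) : '';
        }
        return ''; // unknown names render as nothing
    }
}

$af = new autoformTemplateSketch();
$af->email_address = array('type' => 'text', 'label' => 'Email');
$af->setError('email_address', 'Required');
echo $af->email_address, "\n";
echo $af->email_address_label, ': ', $af->email_address_error, "\n";
```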
One other thing I've thought about is creating individual classes for field types, in an attempt to get rid of the rather large nesting of if and switch statements in my buildFieldTuple() method. Sometimes that sort of re-organization can yield better code. What I'm really looking for is to use encapsulation to make the code more readable and extendable. This means it will be easier to extend Autoform with more complex field types such as date selectboxes or an FCKeditor.
Now, I only have to figure out what to do about arrays of similarly-named fields...
2007-05-25
Been itching to release my next version of dbSimple (now called dbFacile) so I've been thinking about additional ease-of-use features.
Don't Quote These Fields
The first pertains to being able to specify which fields should not be escaped or quoted when performing an insert or update on data. (The automatic escaping and quoting is there to prevent SQL injection.) I've come up with two ways to accomplish this, but first, why would anyone need to do such a thing? Because it may be advantageous to make use of built-in SQL functions, such as those for updating a datetime field or for doing simple math on fields, like so (which would be called when a user logs in to track the number of times they've logged in):
update users set logins=logins+1 where id='2'
First option: Create extra parameter for the insert and update methods that accepts an array of field names that shouldn't be quoted.
$data = array('last_login' => 'now()');
$db->insert($data, 'users', array('last_login'));
$data = array('logins' => 'logins+1');
$db->update($data, 'users', 'id=2', array('logins'));
Second option: Suffix key names in the data array with '=' (or something else) to prevent quoting:
$data = array('last_login=' => 'now()');
$db->insert($data, 'users');
$data = array('logins=' => 'logins+1');
$db->update($data, 'users', 'id=2');
I prefer option two because it requires less work. Also, calls to insert/update might make use of data from foreign places and unless you also pass the "don't quote" array with the data, you won't know which fields shouldn't be quoted.
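A minimal sketch of how option two could be handled internally, assuming the helper only needs to build the SET portion of an update. The function name buildSetClause is made up for illustration, and doubling single quotes is a stand-in for whatever escaping the real driver provides.

```php
<?php
// Hypothetical helper showing option two; not dbFacile's actual code.
// Keys ending in '=' are passed through as raw SQL; all others are quoted.
function buildSetClause(array $data)
{
    $parts = array();
    foreach ($data as $key => $value) {
        if (substr($key, -1) === '=') {
            $parts[] = substr($key, 0, -1) . '=' . $value; // raw expression, unquoted
        } else {
            // Doubling single quotes stands in for the driver's escaping.
            $parts[] = $key . "='" . str_replace("'", "''", (string) $value) . "'";
        }
    }
    return implode(', ', $parts);
}

echo buildSetClause(array('logins=' => 'logins+1', 'name' => "O'Brien")), "\n";
```

Since the marker travels with the key, the raw value goes into the query untouched, so anything under an '=' key should never come from user input.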
Parameterized Where Clauses
This functionality extends protection against SQL injection to where clauses, which are often built from query string or POST data. I've come up with two ways to do this, depending on the need. Both give the same result.
// where clause begins at parameter #3
// it becomes: id='1' and name='something'
$db->update($data, 'table', 'id=? and name=?', $id, $name);
OR
// where clause is the array at parameter #3
// it becomes: id='1' and name='something'
$db->update($data, 'table', array('id' => $id, 'name' => $name));
The first one is more flexible. The second performs an explicit AND between array elements, so you're limited in the logic you can perform (no inequality tests). But you can always construct a where clause string by hand. Lastly, only the second method is a candidate for the Don't Quote These Fields functionality.
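Both forms can be sketched as small string-building helpers. The function names below are hypothetical, the quoting is a stand-in for real driver escaping, and the placeholder version naively treats every ? in the string as a parameter slot.

```php
<?php
// Hypothetical helpers; the quoting is a stand-in for real driver escaping.
function quoteValue($value)
{
    return "'" . str_replace("'", "''", (string) $value) . "'";
}

// Form one: replace each ? with the next quoted parameter, left to right.
function whereFromPlaceholders($where, array $params)
{
    $pieces = explode('?', $where);
    $sql = array_shift($pieces);
    foreach ($pieces as $piece) {
        $sql .= quoteValue(array_shift($params)) . $piece;
    }
    return $sql;
}

// Form two: field => value pairs joined with an explicit AND.
function whereFromArray(array $conditions)
{
    $parts = array();
    foreach ($conditions as $field => $value) {
        $parts[] = $field . '=' . quoteValue($value);
    }
    return implode(' and ', $parts);
}

echo whereFromPlaceholders('id=? and name=?', array(1, 'something')), "\n";
echo whereFromArray(array('id' => 1, 'name' => 'something')), "\n";
```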
2007-05-24
I'm thinking about renaming my dbSimple project because of the name conflict with this project. However, I'm mainly going for ease of use, not code size, so maybe I should find another name. The other project still doesn't have simplified insert or update methods.
In addition to a name change, the upcoming release features code reorganization. I've separated DB-specific code into driver classes. It should make incorporating more databases much simpler and hopefully will result in speedier operation. Many of the alternatives perform significantly faster on simple benchmarks (roughly 40% faster), so one day speed needs to be addressed.
2007-05-19
For Knox, of course, I need the ability to kill the shell running on a Virtual Terminal and whatever process is running in the shell. This isn't as straightforward as I thought it'd be. First, an overview:
What I need is for each VT to have its own controlling terminal (PTY), on which runs an instance of bash with its own process-group id (same pgid as the PTY). Then hopefully bash's child processes will retain the same pgid so that I can easily kill all of those pgids when Knox needs to exit.
Something's going wrong though. When I give each instance of bash its own pgid, bash is then giving its child processes their own pgids. I'm not sure what's going on here but I don't like it. It destroys my ability to easily kill all VT processes.
2007-05-18
I'm running into strange issues with terminal boundaries on Knox. It's not scrolling properly, and not wrapping the cursor when it runs out of bounds. It's difficult to know whether the issue is within TESI or Knox. I'm leaning toward the issue being in Knox, since I have a simple TESI test application in which scrolling and wrapping at terminal boundaries works properly.
After implementing dynamic creation of Virtual Terminals, I realized resizing pre-existing VTs poses a problem. The only way a textual application knows the height and width of its containing window is by checking two environment variables: LINES and COLUMNS. This means that when I create a new VT I must update the environment variables of the other VTs. But this isn't elegantly accomplished and I'm only aware of 1 partial workaround. It'll only work if all VTs are sitting at a shell prompt, AND if those shells support a command like "export" for updating environment variables. Thus, if all other VTs are sitting at bash prompts, I can send "export LINES=43" to each VT as input (with the appropriate number of lines), and another to update COLUMNS. But if the VT isn't at a bash prompt, the input probably won't be received well.
I'd rather implement 0 workarounds, so I might just disable Knox's "create" command if there are existing VTs. If there are 0 VTs, I'll support "create NUMBER" which will create the specified number of VTs. This creation from scratch won't involve any resizing, so the issue is (mostly) avoided. But, it hinders a bit of flexibility. Perhaps it's time to implement VT hiding and the ability to create "virtual desktops" of tiled Virtual Terminals. Then I could allow "create" if there are existing VTs, but the command would create a new virtual desktop with the specified number of VTs.
2007-05-14
This morning I got to thinking about the number of bytes of escape sequence TESI was set to buffer and parse at a time. It was set to 128 bytes so I increased the number to 1024. This resulted in a noticeable speedup, since it now seems to process all of the available data from the virtual terminal. At 128 you can see Knox drawing almost every line from an instance of top running at .5 seconds. When TESI buffers and parses 1024 bytes of data, you can't. 1024 may not be optimal, or may be overkill ... I'll check libiterm's solution and choose something solid.
By the way, I ran valgrind and gprof on knox to get a feel for the functions that are called the most. This data will come in handy if I need/decide to do further optimization.
2007-05-13
To be honest, the SLang library isn't as feature-rich an alternative to nCurses as I hoped, but it provides me another environment where I can refine TESI, so I'm not complaining too loudly. The biggest surprise is that SLang has no concept of a window (a rectangular region of a console that can be moved, stacked, etc. within a parent window), which is one of the most useful things about nCurses. The ability to move a whole region of text around with only 2 function calls is heaven. But, SLang does support colors, other attributes, a type of back-buffering and line-drawing, which are also essential.
Another bit of honesty: I find the input handling routines of nCurses and SLang horribly lacking. Reading keyboard input is such an old practice that I can't understand why there aren't more libraries dedicated to making it painless. What I'm really dying for is an input-handling library that can invoke a callback function in event-driven style when a certain sequence of keys is pressed. After all, the logic that determines if escape was pressed twice within 1 second really deserves to be hidden away rather than spread throughout my program's main loop. Plus, wouldn't it also be great if the portability of being able to check for an input of KEY_ENTER could be combined with the power of being able to get the ASCII sequence(s) that correspond to KEY_ENTER? I think so. The same thing could probably be utilized in GUI applications as well. Might as well make it useful to everyone. However, I'm not so sure I'm up for starting such a project just yet.
2007-05-12
It seems every six months or so I resurrect one of my C projects for a couple days. This time it's Knox. I'll come back to Knox, but for now, let's take a walk.
My January 24 post mentioned SLang and Lua, two languages I had considered getting familiar with as alternatives to C for quicker development. To this day I'm still interested in SLang because of a few things:
- It provides functionality that can serve as an alternative to nCurses, and I like alternatives.
- If you decide to embed an SLang interpreter into your C application, it's very easy to make calls back to some of your C functions. It allows for a "best of both worlds" approach.
So, I considered implementing a hybrid Knox solution using C and an SLang script. The C portion would contain lower level IO/TTY ops, while the SLang script would be the place for higher level UI logic (command parsing, "eye candy"). The idea was to use a scripting language to make the UI portion easier to tinker with. But, I can't find a way to call the text UI functions from within an SLang script, so I scratched the SLang script portion (though I'm sure I'll revisit it in the future) and started on a C SLang-library + TESI version of Knox.
This venture has gone pretty well except that once I arrived at a working prototype, it was terribly slow at running only 1 instance of top. This was quite a different experience from what I had with my libiterm + nCurses version of Knox where I could easily run 3+ instances of top all refreshing at half a second with no noticeable lag. It leads me to wonder:
- Is my TESI implementation flawed in some way? How can it be that much slower than libiterm when it works with a simpler set of escape sequences and is much less code than libiterm?
- Do textual applications expect a VT100 type terminal? Will anything else always be less efficient due to their optimization for VT100s?
- Perhaps my incorporation of SLang needs tweaking. SLang's text-drawing functions are supposed to be more efficient than nCurses.
At this point I need to do some profiling to find the slow areas of my code. Since I've never used any profiling tools there's going to be a bit of a learning curve. But, my code is short so things shouldn't be too difficult.
2007-05-06
A couple weeks ago I began creating more formal tests (unit tests?) for my ORM project. It's amazing how many bugs you find when you write code that tests every thinkable way that ORM could be used. ORM is far from finished. It'll get a real name one day as well.
I've found that the most difficult portion of creating an ORM project is dealing with relations. A User object might be related to many Order objects if a User has made purchases in the past. Pulling these related Order objects is fairly easy, but adding a new related Order object is quite difficult, since I don't want to create an overly complex API to accomplish the feat. I'd much rather be able to do something like "$user->orders[] = $newOrder;", which in PHP-speak appends the $newOrder object to an array of orders. It's essentially a push operation onto a stack. I should be able to allow this in one way or another, but the code will surely push my line count over 1000, and it probably won't be pretty. Speaking of line count, I'm still hovering around 500 lines.
2007-05-03
CakePHP. It's a PHP framework that wants to be Ruby on Rails. But I'm finding once again, as is the case with all frameworks, that if you want to do things slightly out of line with the framework's idea of an application you're going to have difficulty. (This is why I don't like frameworks, and would rather be able to plug specifically-purposed tools together to accomplish the goal).
My particular brand of difficulty has come from the need to tweak the way a form is displayed. Thus, I can't let CakePHP auto-generate the form or scaffolding for me. However, this form also has a multi-select box that models a one-to-many type of relationship. The proper way to process the form and save the related data is beyond my understanding. Seems convoluted. So for now I'm avoiding the issue.
Cake was also supposed to have a built-in permission system. However, it's very immature, and hard to wrap one's head around, so it's stressing me out.
2007-03-17
I've now reached difficult hurdles for my ORM and TESI example projects. The way I see it, ORM requires a couple things for it to be truly worthwhile:
- Ability to treat a table row as an object ... full CRUD. The modifying of the object's member variables should translate to field modifications upon calling save() (or delete()) on the object.
- Ability to fetch a collection of these ORM objects, like when there's a need to treat all customer records as objects for easy handling.
- Ability to read, modify and create relations among ORM objects, which are then translated into database row relationships (whether there are explicit foreign key relationships present or not). One-to-one, one-to-many and many-to-many should all be supported.
Given the above items, I'm striving for an aesthetically pleasing API, as I always do. This means that I want to be able to append ORM objects to a relation in the same way I append elements to an array, using the [] operator. However, PHP's new-fangled SPL has very very limited documentation, and even more limited examples. I need to extend or implement an Iterator, that much is easy and I've done it before, but whether it's possible to overload the [] operator is beyond me. Of course, I should also be able to refer to a relation ORM object by its index in the collection: $user->email_addresses[2]->address
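For what it's worth, PHP's built-in ArrayAccess interface covers the [] case: an append arrives in offsetSet() with a null offset. The sketch below is a bare collection class under that assumption; the class name relationCollection is hypothetical and none of the persistence logic an ORM would need is included.

```php
<?php
// Hypothetical relation collection; a sketch, not the ORM project's code.
class relationCollection implements ArrayAccess, IteratorAggregate, Countable
{
    private $items = array();

    public function offsetExists($offset): bool
    {
        return isset($this->items[$offset]);
    }

    #[\ReturnTypeWillChange]
    public function offsetGet($offset)
    {
        return $this->items[$offset];
    }

    // $collection[] = $x arrives here with $offset === null.
    public function offsetSet($offset, $value): void
    {
        if ($offset === null) {
            $this->items[] = $value;
        } else {
            $this->items[$offset] = $value;
        }
    }

    public function offsetUnset($offset): void
    {
        unset($this->items[$offset]);
    }

    public function getIterator(): Traversable
    {
        return new ArrayIterator($this->items);
    }

    public function count(): int
    {
        return count($this->items);
    }
}

$emails = new relationCollection();
$emails[] = (object) array('address' => 'a@example.com');
$emails[] = (object) array('address' => 'b@example.com');
echo count($emails), ' ', $emails[1]->address, "\n";
```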
Now, about my TESI example projects. They're intended to be proof-of-concept examples that show:
- TESI works
- TESI is fast
- A TESI terminal is easy to implement
However, I began creating the example programs using GTK+ and Fltk, and I'm finding that neither one is easy to use. I've begun with the plan to have TESI output into a standard text box, akin to notepad on Windows, but the API for moving around in one of those isn't trivial. It's a bitch. In GTK there are TextBuffers and BufferIterators to worry about, and their associated functions are less than intuitive. I hoped Fltk would be better because I constantly see it recommended over GTK, but I'm starting to think Fltk is recommended solely because version 2.x is written in C++. Its TextOutput widget seems to lack necessary methods, such as one for checking whether a row/column location is valid. I'm sure the same can be performed by calling a few other methods, but it wasn't readily apparent to me last night.
I still think that in every Open Source library project there should be someone constantly screaming "aesthetics! intuitiveness! programmer efficiency!", and this person should be in charge.
2007-01-25
Today I reworked my website code for probably the 6th time in 4 years. It's my testbed. I restructured the sections and utilized Object Relational Mapping. I'm concerned about the decreased speed that comes from using ORM and have come up with a simple way to speed things up.
When fetching multiple rows of data, the inclination is to fetch all the row ids first, then loop over them and construct objects for each of these ids. This results in n+1 queries (one for the ids, plus one per object), where n is the number of rows. The query that fetches the ids is probably a bit faster than the queries that fetch the rest of the data, but if we fetch everything in one query we skip the n per-object queries entirely. So, instead of passing only the id to the constructor of an ORM class, pass the whole associative array of data. The class can then pull out the id, and merge the rest of the data into the class's internal data store.
$objects = array();
foreach($db->fetch('select * from projects') as $row) {
$objects[] = new projectModel($row);
}
INSTEAD OF
$objects = array();
foreach($db->fetchColumn('select id from projects') as $id) {
$objects[] = new projectModel($id);
}
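A sketch of a constructor that accepts either form, assuming the model decides by checking whether it was handed an array. The class name projectModelSketch, the query counter and the simulated fetchRow() are placeholders so the example runs without a database.

```php
<?php
// Hypothetical model; fetchRow() simulates the per-object query.
class projectModelSketch
{
    public static $queries = 0; // counts simulated per-object queries
    private $id;
    private $data = array();

    public function __construct($idOrRow)
    {
        if (is_array($idOrRow)) {
            // Whole row supplied: pull out the id and keep the rest, no query.
            $this->id = $idOrRow['id'];
            unset($idOrRow['id']);
            $this->data = $idOrRow;
        } else {
            // Only an id supplied: the object has to fetch its own row.
            $this->id = $idOrRow;
            $this->data = self::fetchRow($idOrRow);
        }
    }

    // Simulated database lookup so the sketch runs without a database.
    private static function fetchRow($id)
    {
        self::$queries++;
        return array('name' => 'project ' . $id);
    }

    public function __get($field)
    {
        return $field === 'id' ? $this->id : $this->data[$field];
    }
}

// Built from full rows: no per-object queries.
$fromRows = array();
foreach (array(array('id' => 1, 'name' => 'alpha'), array('id' => 2, 'name' => 'beta')) as $row) {
    $fromRows[] = new projectModelSketch($row);
}

// Built from ids only: one simulated query per object.
$fromIds = array();
foreach (array(1, 2) as $id) {
    $fromIds[] = new projectModelSketch($id);
}
echo projectModelSketch::$queries, " per-object queries\n";
```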
2007-01-24
dbSimple: The problem with separating code into "driver" classes is that there's a disconnect. Sometimes the driver class needs to do queries of its own, such as getting a list of fields in a table, but the logic to do so resides in the main dbSimple class. Guess this means I have to include a reference in the driver class to the parent.
----
As I wanted to do last Spring, I started to write code to minimize a DFA. But the SLang language doesn't have simple functions for doing searches on arrays. Why have I cared about SLang for so long? Oh, that's right, the SLtty module enticed me. Time to look at Lua.
Lua doesn't have common array (table) functions either. I don't understand why someone wouldn't supplement their language with functions for working with their data types. It's beyond me.
2007-01-22
In addition to ORM I began a comparison of dbSimple to other existing DB abstraction classes. It showcases the reduced amount of code one will write when using dbSimple. This led to the desire to perform benchmarks on dbSimple, which led to the desire to speed up dbSimple's operation when one needs to iterate over results of a query. After restructuring some code, moving some into "driver classes", implementing an Iterator class, I think I'll have a faster product. But preliminary tests aren't showing the results I hoped for. Perhaps I need to do some research about optimizing PHP code. But is it worth it?
2007-01-18
This past Wednesday night I couldn't sleep so I stayed up most of the night implementing an ActiveRecord-like class in PHP. It's something that I knew was possible for quite a while but never took the time to investigate existing solutions nor to create my own. The span from 2:30am to 6:30am yielded a mostly functional product. Then after my 8:00am class I worked for another hour and added more join functionality.
The benefits of this sort of thing are immense. The idea is that join queries are trivial and they can be performed automatically for you. My class only performs the join queries as you request related data. For instance, you instantiate a user object with id of 1. Only when you try to reference the user's data (first_name, last_name, etc) does it pull the data from the DB. Then, only when you try to access the user's purchases will it pull that data from the DB using join queries.
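The fetch-on-first-access idea can be sketched with PHP's magic __get. This is an assumption-laden illustration, not the class from that night: lazyRecord is a made-up name, and the closures stand in for the real select and join queries.

```php
<?php
// Hypothetical lazy-loading record; the closures stand in for real queries.
class lazyRecord
{
    private $loaders;          // name => callable that fetches that piece of data
    private $loaded = array(); // name => data fetched so far
    public $log = array();     // which loaders have run, in order

    public function __construct(array $loaders)
    {
        $this->loaders = $loaders;
    }

    public function __get($name)
    {
        if (!array_key_exists($name, $this->loaded)) {
            $this->log[] = $name; // first access: run the query now
            $this->loaded[$name] = call_user_func($this->loaders[$name]);
        }
        return $this->loaded[$name];
    }
}

$user = new lazyRecord(array(
    'first_name' => function () { return 'Jim'; },
    'purchases'  => function () { return array('book', 'lamp'); },
));
// Nothing has been fetched yet; $user->log is empty here.
$name = $user->first_name;  // triggers only the first_name loader
$again = $user->first_name; // served from the cache, no second fetch
```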
2007-01-15
In addition to creating a GTK terminal I'm strongly considering the same application using Fltk to compare the intuitiveness of the libraries. Knowing that several people have recommended Fltk over GTK and wxWidgets, I have the notion that the API is probably better designed thus will be more pleasant to work with.
2007-01-14
I began creating a GTK terminal application that implements a TESI terminal using my TESI interpreter code. The idea is to use a GtkTextView widget (which is basically just a text box with font and color capabilities) as the viewport. It's a bit tricky because cursor movements can't be made to a row and column if the cell hasn't been filled previously. In other words, the TextView widget doesn't have place-holding spaces at every possible X Y location that can be replaced with letters. If X, Y has never been printed to before, I have to output leading spaces up to that position.
Or, I could initialize the TextView widget with spaces, which might make things immensely easier.
2007-01-03
This week has been slow at work so it's time to work on my own projects. Knox comes to mind, but I'm still in the same place I've been for a while: in need of better C libraries to utilize to speed development. The problem is that I can't find libraries that I'm pleased with. I have no Unicode-capable string or input abstraction libraries to speak of.
