Wednesday, August 18, 2010

Birth of a new Distro, Part 2: The experiment, Jet-Fu

As stated in a previous post, dependencies are getting out of hand.  I like to try out different Linux distributions in VirtualBox.  Even some of the so-called 'lite' versions seem to take up over 1GB of hard drive space.  Some distributions try to stay ultra lean, like Puppy, DSL, Tinycore, etc.  But they all seem to be so minimalistic that only the root user is available, or only a finite number of packages can be installed.  Others, as in the case of DSL, are just outdated.

(As a side note, the current version of Puppy, 5.x, is now based on Ubuntu.  I have a Puplet based on the 4.x series installed on an old Dell laptop with a 166MHz Pentium processor.  It runs terrifically.  But the latest Puppy seems to carry the "extras" brought over from Ubuntu.  It seems slow and sluggish.  In my experience, I would not attempt to use the latest Puppy version on hardware older than a Pentium 3.  And don't take that statement as a bash against Ubuntu.  I have Ubuntu 10.04 installed on my home and work PCs.  OK, now back to the topic.)

While testing Tinycore, I found even it is not immune from the bite of Dependency Overload (DO, see this post: http://wsrogers.blogspot.com/2010/07/birth-of-new-linux-distro-part-1.html ).  I figured Tinycore would make a great base for a minimalist server.  A simple LAMP server should not need all the bloat.  While installing MySQL, one of the listed dependencies was PHP5.  Now, installing PHP 5 was on my to-do list, but it should not be a requirement to install MySQL.  It is not listed as a dependency on the MySQL site, nor on many other distributions.  This was 'bad' number 2.  The first 'bad' was the way Tinycore mounts new packages.  Because of its method, only a limited number of packages could ever be installed.

A big plus is that the user is able to create other users besides root.  Another plus: the whole release is around 10MB, and that size includes a window manager.  I recommend Tinycore for anyone looking for a small distribution to use as a base.  For about 6MB, Tinycore's team also offers Microcore, an X-less version.  Because this blog is not a review of Tiny/Microcore, I will not write an elaborate review, but after trying it out, I am impressed and will give it 3.5 stars out of 5.  Since my need is based on avoiding overuse of dependencies, I had to remove one star; the other half star is for its rough edges.  But it's fairly new and I see that it's being actively developed.  This rating only reflects that the distribution failed to fit my purpose.

Now that I have described DO, I should describe my 'want' in a minimalistic distribution.  My experience over the course of 14 years as a now-and-again Linux/BSD user, and the last 5 years as an avid user, has caused me to develop the idea that the OS needs to be an embedded environment on which a user may run applications.  No one else needs to agree with me; I don't ask that of the world.  The belief is mine.  Therefore, I desire a minimalistic distribution (mini-distro) that can have multiple users besides root.  The mini-distro should have a recent kernel with recent system applications.  I'm not looking for speed as much as low memory usage.  (Here, I refer to memory as described by kernel theory, which includes all CPU cache, system RAM, and a non-volatile storage device.)  I would prefer the core system stay out of the way of the user's applications.  Such a minimalistic system will be speedy by its nature.

So, I would like to build my own minimalist OS.  The current name is, in the tradition of many open source projects, an acronym, Jet-Fu, Just Enough To FUnction.  It will be the base for some projects I would like to use myself.  The first is a minimalist anti-virus live image for cleaning infected systems, Jet-Fu:AV.  I won't get on my soapbox to point fingers, we are aware.

This project would run from either an optical disk, such as a CD or DVD (such a waste of space, but they're cheap), a USB drive, or just a tiny partition on the hard drive.  Jet-Fu:AV will boot up, hunt for viruses, and clean them in some way.  The advantage is that the viruses would not have a chance to put themselves in system RAM and escape detection.  The project is not fully described at this time, so whether it will be automated, menu driven, or GUI based has not been decided.

Another project is a LAMP server, although Apache is not necessarily the chosen server, since lighter ones exist.  The database doesn't necessarily have to be MySQL.  I have been looking at some Document-Oriented Databases, such as MongoDB or CouchDB (Google is our friend, use it).  The backend isn't necessarily going to be PHP.  While I do like PHP, I also like Python.  Not to mention some Java/Tomcat action could be possible.  Or even different flavors of this server version.  Only time will tell.  (That is, the time I'm willing and able to contribute to these projects.)

In another blog, I will describe my new infatuation for YAML and JSON.  I will also write about how they will be tied in with my love for Python and how all three will be the basis for my experimental Jet-Fu project.

Also, some may question why I am using GNU/Linux as the base for Jet-Fu and not some other operating system (OS), such as FreeBSD, OpenBSD, NetBSD, Darwin, OpenDarwin, OpenSolaris, or some other OS.  Well, let me say, they are not ruled out, and I'm actively investigating their use.  Remember, Jet-Fu is just enough.  The name doesn't tie it to a specific OS or kernel. (Yes, XNU, I'm looking at you).  Stay tuned for Part 3, coming at a time I feel like writing.

Saturday, July 24, 2010

Why Python?

Over the last few years I have been looking at various languages.  Many have great strengths, some have great weaknesses, and some are just right.  Last year I decided to start learning some Python, so I searched for a good book.  What I found was an excellent beginner's book.  Not only is the book a great reference, it's open source.  Anyone is allowed to modify the book and redistribute it.  Actually, the author of the Python version did just that.  The original book was written by a teacher to teach Java.  Allen Downey, Jeff Elkner, and Chris Meyers worked Open Source magic on words, eventually publishing "How to Think Like a Computer Scientist: Learning with Python."  I highly recommend this book as a gentle guide from other languages.

I believe Python should be a first language to introduce budding hackers to the world of coding.  Several universities teach a C-type syntax language as an intro.  Not that C-style is bad, I happen to like curly braces, but beginners always have a hard time with the approximately five lines of code needed to print "Hello World" on a console.  Beginners begin by learning debugging before anything is printed to the screen: what to import or include, where to put the first curly brace, when to use main(), when to use semicolons, and much more.  Python allows a beginner to focus on the logic of the code, not the syntax, except for one caveat: whitespace.  Sometimes curly braces are a friend, especially when dealing with blocks of code.  But let's forget about this tiny flaw. *smile*

Beginners can learn to use logic and work with flow control without the headache of the five-line overhead.  What five lines?

#include <iostream>

int main()
{
    std::cout << "Hello, world!\n";
}


I guess everything could be put on one or two lines, but I'm for readability.  The above code is C++ whereas Python would simply be:

print("Hello, world!")

This example uses the new print function from Python 3.  For earlier versions, simply replace the parentheses with a space.
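As a minimal sketch for anyone still on Python 2.6 or 2.7, the line can be written so that it runs unchanged on both series.  The greet helper below is a hypothetical name used only for illustration.

```python
# Opt in to the Python 3 print function. This import has no effect on
# Python 3 itself and is available from Python 2.6 onward.
from __future__ import print_function


def greet(name):
    """Build the greeting ('greet' is a hypothetical helper for illustration)."""
    return "Hello, " + name + "!"


print(greet("world"))
```

Still no includes, braces, or main() to get wrong before the first line of output appears.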

Once the beginner learns standard logic flow controls, then introduce them to the C-style languages.  I began learning at the university with C#, then learned some Java, and took two semesters of COBOL.  Although I began writing code many years ago in a dead language called GW-BASIC, I just didn't follow the path of the hacker until later.

Ease of learning and use are the first reasons for 'Why Python?'  Some other reasons: the language works in almost every environment, is free to download and use, and has a huge repository, known as the internet, on hand for code libraries and books.  This blog post is simple.  I'm not here to change anyone's mind.  But this post will lead to more posts about Python and some uses I'm planning.  Next I plan to write some about YAML (originally Yet Another Markup Language, now YAML Ain't Markup Language) and JSON, and why XML fails at marketing.

P.S.  I forgot to mention, I'm a huge fan of readable code and variables that clearly state their objective.  Readability is one reason I didn't choose from a host of other coding options.

Saturday, July 17, 2010

Birth of a new Distro, Part 1: Dependency Overload or NeoDependency Hell

I am by no means an expert on operating systems. But I do use them.  I have used various systems, from Windows, to GNU/Linux, and to the three major flavors of BSD.  I have used many distributions based on the major players in the Linux world, and the few flavors of distributions based on the *BSDs.  Each one has positives and negatives.  And, like a good tool box, each works well when it's implemented where it works best. 

Recently, though, I've been playing with the distro (distribution) that I use daily, Ubuntu.  I noticed that while trying to remove apps (software applications and, here, libraries) I don't use, the uninstaller wants to remove either my desktop environment (DE), GNOME, or other apps that I use often.  This issue is common across many distros of Linux, at the least.  I am unsure about the *BSDs, as I did not notice it at the time I operated those systems.

This is a new dependency hell.  At one time, DLL hell and RPM hell were well known.  In those instances, different apps needed different versions of a library while on the same operating system.  Now, the hell is removing these apps, which now want to also remove major parts of the system.  Along with this dependency removal hell is the dependency overload on installation.  Many applications bring in large numbers of dependencies (and, in Debian, even recommended apps).  The term should be Dependency Overloading (DO) or NeoDependency Hell (NDH).
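The removal side of this can be sketched in a few lines of Python.  This is only a toy model under stated assumptions: the package names and the dependency map are hypothetical, and real package managers have more rules than "remove everything that depends on what is being removed."

```python
# Toy dependency map: package -> packages it depends on.
# All names are hypothetical, for illustration only.
DEPENDS = {
    "desktop-meta": ["media-player", "file-manager"],
    "media-player": ["codec-lib"],
    "file-manager": ["ui-toolkit"],
    "photo-app": ["ui-toolkit", "codec-lib"],
    "codec-lib": [],
    "ui-toolkit": [],
}


def removal_cascade(package, depends):
    """Return every other package that must also go if 'package' is removed."""
    removed = {package}
    changed = True
    # Keep sweeping until no package depends on anything already removed.
    while changed:
        changed = False
        for name, deps in depends.items():
            if name not in removed and any(d in removed for d in deps):
                removed.add(name)
                changed = True
    return removed - {package}


print(sorted(removal_cascade("media-player", DEPENDS)))
```

In this toy map, removing the unused media player takes the desktop meta-package with it, which is the behavior described above.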

Dependency Overloading seems to occur more often in distributions that focus on binary packages.  Compile-based systems would only need whatever software is required to compile a package.  These systems do require a local build environment, though, which sometimes requires as much space as NDH binaries.

The compiled system would also build packages according to the local system, instead of vanilla packages meant to run on various hardware.

Again, I am no expert.  Even though I have run compile-based distros, such as Gentoo, I ran them at a time when I barely understood the system, so I cannot fully comment on their installation or removal processes.

Now that the kernel can ID much of the hardware and, with the help of scripts, can dynamically load modules per system, it is time to leverage this detection mechanism with one of the greatest ideas released into the open source community, the Live-CD/DVD.  By marrying a live environment that does not actually run on the host system's hard drives (hdd) with the great detection mechanisms built into the kernel, a compiled system could be built lean and installed clean.

A system newly installed via the live compile method would require less hdd space.  Since the introduction of large gigabyte drives, the mantra has been, "why worry, we have the space," which goes along with the RAM mantra of, "if memory is too small, we can cheaply add more."

These two mantras are contributing to the bloated systems of today.  Data should be the focus of hdd space use, not the operating system.  No more RAM than necessary should be used, whether the requirement is 8MB or 8GB.

I was led down this road while simply trying to install a small LAMP web server that would only serve a few pages.  But some of the simplest solutions required nearly 2GB of hdd space, yet only needed about 80MB of RAM.  Since my server would be dedicated to doing one thing, not being a desktop requiring uninterrupted multimedia playback, why did it require nearly 2GB of hdd space?

I searched for an implementation that could fit into less than a few hundred MB on a hdd.  I found TinyCore and a Puplet (a community derivative of the Puppy GNU/Linux distribution) conveniently called WebserverPuppy.  While TinyCore did not have any type of web server installed by default, the entire OS is an ISO (CD/DVD image) file of less than 11MB.  WebserverPuppy requires under 75MB.  The issue I have with the Puppy version is something that Puppy has always used: root only.

I would rather my server not be run in root mode.  TinyCore (TC) has both user and root modes.  I also found that binaries installed on TC required minimal dependencies.  But one that bothered me was a dependency of MySQL: PHP.  While I intended to install PHP, having it as a dependency of MySQL didn't sit well with me.  MySQL does not require PHP to function, so such a dependency is NDH to others who don't require PHP.  Now I have to question every application available for install on TC: does it have NDH?
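To show how one unneeded entry inflates an install, here is a rough Python sketch that walks a dependency map and collects everything a package pulls in.  The map is hypothetical: only the MySQL-to-PHP entry mirrors the situation described above, and the library names are placeholders, not TC's real package lists.

```python
# Hypothetical dependency lists; only the php5 entry under mysql mirrors
# the case described in the post. The rest are placeholder names.
DEPENDS = {
    "mysql": ["libc", "php5"],
    "php5": ["libc", "libxml"],
    "libxml": ["libc"],
    "libc": [],
}


def install_closure(package, depends):
    """Return the set of packages pulled in by installing 'package'."""
    needed = set()
    stack = [package]
    while stack:
        current = stack.pop()
        if current in needed:
            continue  # already counted, avoids loops on circular deps
        needed.add(current)
        stack.extend(depends.get(current, []))
    return needed


with_php = install_closure("mysql", DEPENDS)
# Same map, but with the unneeded php5 entry dropped from mysql.
trimmed = dict(DEPENDS, mysql=["libc"])
without_php = install_closure("mysql", trimmed)

# Packages that are installed only because of the extra dependency.
print(sorted(with_php - without_php))
```

The difference between the two sets is the overload: everything installed only because of a dependency the application does not need in order to function.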

Inspired by these two light distributions, I set out to find a distribution that does not have DO.  None have caught my eye.  So, if you know of a distribution that has excellent hardware detection in the Live-CD/DVD environment, installs light, and has no Dependency Overloading, let me know.

Otherwise, I am beginning work on a new project that at least will be able to successfully detect all hardware in a live environment, compile or install binaries per the detected hardware, yet be light on the hdd and in RAM.  At this time, this experimental project is called Jet-Fu, Just Enough To FUnction.

Whatever functionality is required, this distribution should only fulfil the requirement, no more, no less.