Archive for the ‘Uncategorized’ Category

a common css mistake

Friday, May 14th, 2010

I'm playing with febootstrap this evening, part of a continuing migration away from Ubuntu toward Debian and Fedora. I googled my way to Rich's febootstrap page. On my browser, it looks like this:

The problem is that the page's CSS has the following rules:

h1, h2, h3, h4 {
  color: #333;
}

pre {
  background-color: #fcfcfc;
}

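Both of these rules set only one half of a color pair and assume the usual black-on-white color scheme for the other half, but I'm using a GTK theme with a dark background and light foreground, which Firefox respects. The result is dark headings on a dark background and light text on a nearly white pre block. The fix is simple: whenever a rule sets color, it should set background-color too, and vice versa. (Chromium doesn't have the issue because it enforces black-on-white defaults regardless of the GTK theme. I'm not sure how I feel about that; it fixes the problem but at the expense of ignoring my theme.)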

This isn't to say anything bad about febootstrap, of course. I'm thrilled that somebody has finally written for Fedora what the excellent debootstrap has been providing for Debian and Ubuntu users for years! But I've come across this mistake enough times that Rich's site drew the unlucky number. ;-)

“git along little dogies…”

Friday, April 23rd, 2010

At my place of work we're moving our repo management from gitosis to an evaluation installation of GitHub:FI. After spending a while searching for how to move, back up, and transfer git repos, I was unable to find a good example. I had to dig through the git manpages for quite a while to figure out what I wanted. The solution is quite simple, but it was not obvious what combination of git clone/fetch/push/pull or other commands was appropriate. Now, this may be due to my own stupidity, but even when I had the solution in hand I was still unable to find pages which showed how to fully copy a repo from one remote location to a new remote location. I'm posting what I found here for posterity.

Movin' right along...

The basic process of moving a repo including all branches and tags is as follows:

git clone --mirror git@gitosis:my-repo.git
git --bare --git-dir my-repo.git push --mirror git@githubfi:my-user/my-repo.git

The above assumes that the git@githubfi:my-user/my-repo.git repo was created as an empty repo at some point before the last command was run.

With the default config settings, git clone will create remote tracking branches for all branches found in the remote repo but will only create a local branch for the repository's currently active branch. In order to fully mirror the repository and all references (branches and tags) locally, you need to use the --mirror option which will also create the repo as a bare repository.

Rawhide!

A normal working copy has, at the top level, your checked out files and a .git/ sub-directory. A bare repository omits the working copy and has the contents of the .git/ directory at the top level. Bare repositories are intended to be used remotely (such as by a repository management system) and, by convention, are named with a .git suffix (e.g. my-repo.git) to distinguish them, as seen in the repo URLs above.

Since a bare repo has no working copy to sit in, we need to tell git where and how to find it when we call git commands against the repository, hence the --bare --git-dir my-repo.git options. The git push --mirror ensures that all refs (branches/tags) will be pushed to the new location, instead of the default behaviors of pushing only the refs specified on the command line or, if none are specified, pushing the branches that match by name between the local and remote repositories.
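To convince yourself that the clone/push pair really carries every branch and tag, here is a minimal sketch that runs the same two steps against throwaway local repositories and compares the refs on both sides. The names old-src, new-remote.git and mirror_demo are made up for the demo, and local paths stand in for the ssh URLs.

```shell
#!/usr/bin/env bash
# Sketch only: mirror a throwaway repo into an empty bare repo, then compare refs.
# old-src, my-repo.git, new-remote.git and mirror_demo are hypothetical names.
mirror_demo() (
    set -e
    work=$(mktemp -d)
    trap 'rm -rf "$work"' EXIT
    cd "$work"

    # stand-in for the old remote: one commit, an extra branch, and a tag
    git init -q old-src
    git -C old-src -c user.name=demo -c user.email=demo@example.com \
        commit -q --allow-empty -m "first commit"
    git -C old-src branch devel
    git -C old-src tag v1.0

    # stand-in for the new remote, created empty beforehand
    git init -q --bare new-remote.git

    # the same two steps as above, with local paths instead of ssh URLs
    git clone -q --mirror old-src my-repo.git
    git --git-dir my-repo.git push -q --mirror "$work/new-remote.git"

    # list every ref with the object it points to, on both sides
    old=$(git --git-dir my-repo.git for-each-ref --format='%(refname) %(objectname)')
    new=$(git --git-dir new-remote.git for-each-ref --format='%(refname) %(objectname)')
    [ "$old" = "$new" ] && echo "refs match"
)

mirror_demo
```

If the two ref listings differ, the function prints nothing and returns non-zero, which is a cheap sanity check to run before retiring the old server.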

More cowbell...

There's one further thing to mention that may be helpful. While transitioning from gitosis to GitHub:FI I took the opportunity to make our pre-receive hook a bit more stringent in its checking of emails, requiring that the committer email be from our corporate domain on non-upstream branches. In order to do this I had to use git filter-branch to do a little cleanup. Because we had a bare repo, the invocation looked a bit different. I'm showing a simplified version below:

git clone --mirror git@gitosis:my-repo.git
git --bare --git-dir my-repo.git filter-branch --commit-filter \
    '[ "$GIT_COMMITTER_EMAIL" = "joe@home.org" ] && export GIT_COMMITTER_EMAIL="joe@work.com"; \
        git commit-tree "$@"' \
    -- master devel staging v2.{30..45}
git --bare --git-dir my-repo.git push --mirror git@githubfi:my-user/my-repo.git

Note that history rewriting such as done by git filter-branch has implications for any repos that have been cloned out in the wild. Be sure you understand history rewriting before you use this command or there will be pain and sadness among the other developers. In our case, the above changes were done in a manner that was coordinated among the developers involved and was quite pain free. Note also that I did not rewrite the "upstream" branch which did not need to be purged of non-work addresses so I omitted it from the list of refs to be filtered. Further note that tags, since they are refs, also need to be rewritten so I passed our version tags to git filter-branch as well.

daemonizing bash

Tuesday, April 20th, 2010

Before we jump into this, let's be clear about intent: There are better languages for writing daemons than bash. Honestly, any other language is probably a better choice. Writing a daemon implies that you're writing a sufficiently complex program that bash is already the wrong language, with or without daemonization!

But if you find yourself in the unfortunate position of needing to daemonize an existing bash program, and you'd rather put off rewriting it in a more suitable language, keep reading! I found myself in that position recently and kept some notes.

Daemonizing a process consists of two primary tasks: forking to the background to return control to the shell, and preventing undesirable interaction between the process and the host. Rich Stevens enumerated the steps in his classic Advanced Programming in the UNIX Environment. Here's my summary of his formula with implementation notes for bash.

  1. Call fork (to guarantee the child is not a process group leader, necessary for setsid) and have the parent exit (to allow control to return to the shell).

    Forking in bash is a simple matter of putting a command in the background using "&". To put a sequence of commands in the background, use a subshell: "( commands ) &". Note that bash doesn't provide any method for the child process to continue the same execution path as the parent, so the entirety of the child must be contained in the subshell. The easiest way to do this is implement the child as a bash function: "childfunc &".

  2. Call setsid to create a new session so the child has no controlling terminal. This simultaneously prevents the child from gaining access to the controlling terminal (using /dev/tty) and protects the child from signals originating from the controlling terminal (HUP and INT, for example).

    Bash provides no method to call the setsid syscall for the current process. We have two less-than-ideal alternatives:

    1. The util-linux-ng package provides an external setsid command but this daemonizes an external command rather than the currently running script. It also makes collecting the PID of the child tricky because the setsid command will fork internally.

      Having said all that, if your application allows you to use the setsid command, it's a good choice because bash can't otherwise fully protect against the child process opening /dev/tty. It's still a good idea to redirect std* to prevent stray output to the terminal.

    2. Lacking the setsid syscall, there are steps we can take to partially protect the child process from the effects of the controlling terminal:
      1. Redirect std* to files or /dev/null
      2. Guard against HUP and INT by signal handler in child
      3. Guard against HUP by disown -h in parent

      Unfortunately, without setsid there is no way to guard completely against a subchild opening /dev/tty. That exposure lasts until the terminal emulator exits, at which point /dev/tty becomes unavailable.

  3. Change working directory to / to prevent the daemon from holding a mounted filesystem open.

    Bash is good at this. :-)

  4. Set umask to 0 to clear file mode creation mask.

    I have to admit that I can't understand the point of this, in bash or any other language. It seems to me that the child will either set its umask explicitly before creating files, or it will set individual file permissions explicitly, or it will fall back on the caller's umask. In the last case, I want my inherited umask, not the wide-open zero.

    If anybody wants to explain a good reason for step 4, I'm all ears... Until then, it's commented out in my implementation below.

  5. Close unneeded file descriptors.

    This step is fun in bash using eval and brace expansion...
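The eval-plus-brace-expansion trick in step 5 is easier to follow once you see what the expansion produces. This is a small sketch, not part of the implementation; the function names show_close_expansion and close_fds_demo are made up, and the range is shortened so the output stays readable.

```shell
#!/usr/bin/env bash
# Sketch only; function names are hypothetical.

# Brace expansion yields one "N>&-" word per descriptor. The backslashes keep
# bash from treating > and & as operators on this first pass.
show_close_expansion() {
    echo exec {3..5}\>\&-
}

# eval re-parses the expanded words, so this time they act as real redirections.
close_fds_demo() (
    exec 3>/dev/null 4>/dev/null    # open two extra descriptors
    eval exec {3..9}\>\&-           # close 3 through 9 in one statement
    # writing to a closed descriptor now fails
    if ! { true >&3; } 2>/dev/null; then
        echo "fd 3 closed"
    fi
)

show_close_expansion
close_fds_demo
```

The first function prints the command line that eval will see; the second runs in a subshell so that the calling shell's descriptors are left alone.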

With those notes in hand, here's my implementation. There are two functions here: "daemonize" for an external command using setsid, and "daemonize-job" for a function in the running script.

# redirect tty fds to /dev/null
redirect-std() {
    [[ -t 0 ]] && exec </dev/null
    [[ -t 1 ]] && exec >/dev/null
    [[ -t 2 ]] && exec 2>/dev/null
}

# close all non-std* fds
close-fds() {
    eval exec {3..255}\>\&-
}

# full daemonization of external command with setsid
daemonize() {
    (                   # 1. fork
        redirect-std    # 2.1. redirect stdin/stdout/stderr before setsid
        cd /            # 3. ensure cwd isn't a mounted fs
        # umask 0       # 4. umask (leave this to caller)
        close-fds       # 5. close unneeded fds
        exec setsid "$@"
    ) &
}

# daemonize without setsid, keeps the child in the jobs table
daemonize-job() {
    (                   # 1. fork
        redirect-std    # 2.2.1. redirect stdin/stdout/stderr
        trap '' 1 2     # 2.2.2. guard against HUP and INT (in child)
        cd /            # 3. ensure cwd isn't a mounted fs
        # umask 0       # 4. umask (leave this to caller)
        close-fds       # 5. close unneeded fds
        if [[ $(type -t "$1") != file ]]; then
            "$@"
        else
            exec "$@"
        fi
    ) &
    disown -h $!       # 2.2.3. guard against HUP (in parent)
}
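If you want to see the signal guard from step 2.2.2 in action without a real terminal hangup, the following sketch has a backgrounded subshell send HUP to itself after installing the same empty trap. The name hup_guard_demo and the temp file are made up for the demo, and $BASHPID assumes bash 4 or later.

```shell
#!/usr/bin/env bash
# Sketch only: show that an empty trap makes a backgrounded subshell ignore HUP.
# hup_guard_demo is a hypothetical name; $BASHPID requires bash 4+.
hup_guard_demo() {
    local out
    out=$(mktemp)
    (
        trap '' HUP INT          # same guard as "trap '' 1 2" above
        kill -HUP "$BASHPID"     # without the trap this would kill the subshell
        echo survived >"$out"    # only reached if HUP was ignored
    ) &
    wait "$!"
    cat "$out"
    rm -f "$out"
}

hup_guard_demo
```

Commenting out the trap line should leave the file empty, since the subshell dies before it reaches the echo.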

clock trick

Wednesday, March 10th, 2010

I bought a Sony "Dream Machine" Clock Radio about a year ago and have liked it except for one problem: even when the brightness is set to "low", it's still too bright for my taste. Each night I have to rotate the clock away from me on the nightstand so I can fall asleep.

This morning I applied Insta-Cling Windshield Strip Professional Tint Film to the front and I think it's not bad for $4.88 and a few minutes with an x-acto knife!

Here's the before/after ("before" courtesy of Amazon since I didn't think of taking a photo):

http://agriffis.n01se.net/blog-images/clock-before-after.jpg

SHA1 broken by US Government

Tuesday, July 7th, 2009

...but not in the way you might expect. One of us noisers (Gerry Leach) tried to use the Argonne National Laboratory's implementation of SHA1 to double-check his own computations. What he found was a bit of a surprise, to say the least...

Given the input string 1316798755 the above site returns DA39A3EE5E6B4BD3255BFEF95601890AFD879. One wouldn't normally question this result, coming from a national laboratory, except that it didn't match Gerry's local tests, nor does it match mine:

$ echo -n 1316798755 | sha1sum | tr '[:lower:]' '[:upper:]'
A897C3B9E5A64D609A1D7DB3D1D7F4C03C3F00A1  -

Bob Bell quickly pointed out the similarity of the site's computation to the SHA1 of the empty string:

$ sha1sum </dev/null | tr '[:lower:]' '[:upper:]'
DA39A3EE5E6B4B0D3255BFEF95601890AFD80709  -
DA39A3EE5E6B4BD3255BFEF95601890AFD879 (output from ANL)

After a few more tests, Bob enumerated the issues with the site's computation:

  • First, it strips any non-alpha characters from the input, including digits and whitespace.
  • Then it converts lowercase input to uppercase, so the result for "foobar" is the same as the result for "FOOBAR", but even so the final answer is wrong because...
  • It prints the result as a string of bytes using %X instead of %02X, so the output drops leading zeroes in the hex representation of each byte.
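The three symptoms are enough to reproduce the site's answer locally. This is a sketch reconstructed purely from Bob's list, not the site's actual code; anl_sha1 is a made-up name, and it assumes sha1sum from coreutils is on the PATH.

```shell
#!/usr/bin/env bash
# Sketch only: emulate the three symptoms listed above. Not the site's real code.
# anl_sha1 is a hypothetical name; sha1sum (coreutils) is assumed to be installed.
anl_sha1() {
    local cleaned digest out="" i
    # 1. drop every non-alphabetic character, 2. uppercase what is left
    cleaned=$(printf '%s' "$1" | tr -cd '[:alpha:]' | tr '[:lower:]' '[:upper:]')
    # the correct SHA1 of the cleaned input, as 40 hex digits
    digest=$(printf '%s' "$cleaned" | sha1sum | cut -c1-40)
    # 3. print each byte with %X instead of %02X, so any byte below 0x10
    #    loses its leading zero
    for ((i = 0; i < ${#digest}; i += 2)); do
        out+=$(printf '%X' "$((16#${digest:i:2}))")
    done
    printf '%s\n' "$out"
}

anl_sha1 1316798755
```

Because the input is all digits, step 1 leaves an empty string, so the function hashes nothing at all and then mangles the bytes 0D, 07 and 09 into D, 7 and 9, giving the same 37-character string the site returned.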

I wonder what Argonne is doing with this particular calculator... Dare we hope for... nothing?

using the reStructuredText plugin for WordPress

Thursday, May 15th, 2008

The first thing I did after migrating content from our old blog was to install the reStructuredText WordPress plugin. The second thing I did was convert a few entries to use it. And the third thing, well, the third thing was to spend a couple hours trying to figure out why it insisted on rendering headings as <h1> instead of <h3> even though I had set $rst2html_options = '--initial-header-level=3 --no-doc-title ...'.

Finally found it! $rst2html_options is global but the rendering function was using an uninitialized local value. Here's the patch:

--- rest.php.old 2008-05-15 08:14:52.000000000 -0700
+++ rest.php     2008-05-15 08:15:00.000000000 -0700
@@ -76,6 +76,7 @@
  */
 function reST($text) {
   global $rst2html;
+  global $rst2html_options;
   global $cachedir;
   global $usepipes;
   global $tmpdir;

I'll try to get this upstream, but the launchpad project doesn't provide any contact information and claims it doesn't use a bug database. Best I've been able to figure out so far is to leave a comment here.

sudoku by regex

Thursday, May 3rd, 2007

About a year ago, I proposed a contest on #noise to write a sudoku solver. LIM wrote us a driver and generator. Chouser, Mr_Bones_, and owend wrote solvers. I started one in LISP but never finished it, to my shame...

One question that's always been in my mind is "How could we use a regex to solve a sudoku?" The main problem seems to be the input: Regular expressions are a language for searching, so you need the search space expressed in a way that the regex can process it. The naive approach expresses all the possible puzzles on the input, which is way too much data.

I usually think of a sudoku like this, where zeros are squares that need to be solved:

0 0 0 4 0 0 0 0 0
0 0 0 0 0 2 0 7 0
0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 4 1 0
0 0 9 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 6 7
0 0 0 0 0 0 0 0 0
0 7 6 3 0 0 0 0 0

The problem with this representation is that the zeros aren't searchable. The regular expression can't tell that they mean "any number 1 through 9". So for a first step, my solver replaces them:

$puz =~ s{\d}{$& ? $&.'        ' : '123456789'}ge;

So the massaged input looks like this:


123456789 123456789 123456789 4         123456789 123456789 123456789 123456789 123456789
123456789 123456789 123456789 123456789 123456789 2         123456789 7         123456789
123456789 123456789 123456789 123456789 123456789 123456789 123456789 123456789 123456789
123456789 123456789 123456789 123456789 123456789 123456789 4         1         123456789
123456789 123456789 9         123456789 123456789 123456789 123456789 123456789 123456789
123456789 123456789 123456789 123456789 123456789 123456789 123456789 123456789 123456789
123456789 123456789 123456789 123456789 123456789 123456789 123456789 6         7
123456789 123456789 123456789 123456789 123456789 123456789 123456789 123456789 123456789
123456789 7         6         3         123456789 123456789 123456789 123456789 123456789

Now each cell contains the full list of possibilities for that cell. This is searchable. And with negative lookahead assertions, it doesn't have to waste nearly as much time backtracking.
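For readers who don't speak Perl, the first substitution can be sketched in bash as well. This is only an illustration of the input massaging, with the made-up name expand_puzzle; it is not one of the solver scripts, and it assumes cells are separated by single spaces as in the puzzle above.

```shell
#!/usr/bin/env bash
# Sketch only: bash rendering of the first Perl substitution above.
# expand_puzzle is a hypothetical name; input is one puzzle row per line.
expand_puzzle() {
    local line cell
    local -a cells
    while read -r line; do
        cells=()
        for cell in $line; do
            if [[ $cell == 0 ]]; then
                cells+=("123456789")        # unknown cell: every candidate
            else
                cells+=("$cell        ")    # known cell: digit plus 8 spaces
            fi
        done
        printf '%s\n' "${cells[*]}"         # rejoin the cells with single spaces
    done
}

printf '0 0 0 4 0 0 0 0 0\n' | expand_puzzle
```

Padding the known digits to nine characters keeps every cell the same width, so a cell's position in the string is a simple multiple of ten.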

The final output, after the second substitution, is:

9 8 7 4 6 5 3 2 1
6 5 4 1 3 2 9 7 8
3 2 1 9 8 7 6 5 4
8 6 5 7 9 3 4 1 2
7 4 9 8 2 1 5 3 6
2 1 3 6 5 4 7 8 9
5 3 8 2 4 9 1 6 7
1 9 2 5 7 6 8 4 3
4 7 6 3 1 8 2 9 5

Neat, eh? So finally, here are the two scripts implementing this solution. The first script is the solver itself. It accepts a puzzle on stdin formatted like the input above, converts it internally to the searchable format, and generates on stdout the solved puzzle. The second script generates the regular expression which is inserted into the first script.

By the way, while this isn't the fastest solver out there by any means (it's just brute force, after all), it solves most puzzles in approximately 0.01 seconds. That's 1/10 of the time genre.pl takes to generate the regular expression in the first place.

Placeholder post

Saturday, May 20th, 2006

Blogger requires me to make a post before I can preview the template or subscribe to the RSS feed. Is this fair enough or lame to the max? Discuss...