• How?

    From Angus McLeod@VERT/ANJO to All on Sunday, September 10, 2006 21:43:00
    Have a look at this photo:

    http://hadar.cira.colostate.edu/ramsdis/online/data/rmtcrso/457.jpg

    I'm grabbing these with LWP::Simple using get() or getstore(), no problem.

    Examine the strip at the bottom of the photo. See the text data at the
    bottom? I need to read some of this. In particular, I need to read the
    part which says "10 SEP 06" and the part which says "1645". This is so I
    can timestamp the image at "2006-09-10 16:45". I need to do this as a
    part of the process that fetches the image, so it can be stuck in the
    database with the correct timestamp.

    Anyone have any ideas on how I might post-process the image after
    retrieval, to recover the timestamp info I need?
    ---
    Playing: "Your mirror" by "Simply Red" from the "Stars" album
    þ Synchronet þ Programatically generated on The ANJO BBS
  • From Digital Noise@VERT to Angus McLeod on Monday, September 11, 2006 00:57:35
    Only way I know of to recover text data from a photograph (being a
    binary medium) would be if the photo had EXIF (sp?) data encoded -
    typically placed by digital cameras and contains Camera Make/Model;
    Fstop; Shutter Speed; Date/time, etc.

    Other than that, I'm not sure how you would obtain text data from a
    binary object such as a photograph through automatic means...

    DN

    ---
    þ Synchronet þ Vertrauen þ Home of Synchronet þ telnet://vert.synchro.net
  • From Deuce@VERT/SYNCNIX to Angus McLeod on Monday, September 11, 2006 11:30:00
    Re: How?
    By: Angus McLeod to All on Sun Sep 10 2006 09:43 pm

    > Anyone have any ideas on how I might post-process the image after
    > retrieval, to recover the timestamp info I need?

    Using ImageMagick to crop out the bottom strip and save it as a PBM, then
    running that through ocrad (http://www.gnu.org/software/ocrad/ocrad.html),
    should do the trick, I would think.
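
    A minimal sketch of that crop-then-OCR pipeline, assuming the `convert`
    and `ocrad` command-line tools are installed. The helper names
    (crop_command, parse_caption), the crop geometry and the caption layout
    are all assumptions made for illustration: the geometry is a guess padded
    around the pixel coordinates used later in this thread, and the regex
    only expects the "10 SEP 06" and "1645" fields quoted in the original
    post. Whether ocrad can read text this small is untested.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %MONTH = (JAN => 1, FEB => 2, MAR => 3, APR => 4, MAY => 5, JUN => 6,
             JUL => 7, AUG => 8, SEP => 9, OCT => 10, NOV => 11, DEC => 12);

# ImageMagick command that crops the caption strip and saves it as a
# bilevel PBM. $geometry is WIDTHxHEIGHT+X+Y.
sub crop_command {
    my ($jpeg, $pbm, $geometry) = @_;
    return ('convert', $jpeg, '-crop', $geometry, '+repage',
            '-threshold', '50%', $pbm);
}

# Turn OCR output such as "10 SEP 06 1645" into "2006-09-10 16:45".
# Extra digits after the year or after the minutes are tolerated.
# Returns undef when the text does not match.
sub parse_caption {
    my ($text) = @_;
    return undef
        unless $text =~ /\b(\d{1,2})\s+([A-Z]{3})\s+(\d{2})\d*\s+(\d{2})(\d{2})/;
    my ($day, $name, $yy, $hh, $mm) = ($1, $2, $3, $4, $5);
    return undef unless exists $MONTH{$name};
    return sprintf '%04d-%02d-%02d %02d:%02d',
        2000 + $yy, $MONTH{$name}, $day, $hh, $mm;
}

# Run the pipeline only when a file name is given on the command line.
if (@ARGV) {
    my $jpeg = $ARGV[0];
    my $pbm  = "$jpeg.pbm";

    # ASSUMPTION: the strip position is a guess; adjust it to the real image.
    system(crop_command($jpeg, $pbm, '170x12+185+467')) == 0
        or die "convert failed\n";

    # ocrad writes the recognised text to standard output.
    open(my $ocr, '-|', 'ocrad', $pbm) or die "cannot run ocrad: $!\n";
    my $text = do { local $/; <$ocr> };
    close $ocr;

    my $stamp = parse_caption($text // '');
    print defined $stamp ? "$stamp\n" : "????-??-?? ??:??\n";
}
```

    Keeping the parsing in its own function means it can be checked against
    sample strings without having either tool installed.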

    ---
    Wheeble.

    ---
    þ Synchronet þ My Brand-New BBS (All the cool SysOps run STOCK!)
  • From Deuce@VERT/SYNCNIX to Angus McLeod on Monday, September 11, 2006 11:36:00
    Re: How?
    By: Deuce to Angus McLeod on Mon Sep 11 2006 11:30 am

    > Anyone have any ideas on how I might post-process the image after
    > retrieval, to recover the timestamp info I need?

    > Using imagemagick to grab the bottom bit and save as a pbm then running
    > it through ocrad (http://www.gnu.org/software/ocrad/ocrad.html) should
    > do the trick I would think.

    Actually, in looking at it, it appears that the timestamp info is already
    pretty darn close to correct, depending on the time zone it's in.

    A HEAD request should give you this info in the Last-Modified header.

    example:
    telnet hadar.cira.colostate.edu http
    Trying 129.82.108.81...
    Connected to hadar.cira.colostate.edu.
    Escape character is '^]'.
    HEAD /ramsdis/online/data/rmtcrso/457.jpg HTTP/1.0
    Host: hadar.cira.colostate.edu

    HTTP/1.1 200 OK
    Server: Microsoft-IIS/5.0
    Date: Mon, 11 Sep 2006 17:33:46 GMT
    Content-Type: image/jpeg
    Accept-Ranges: bytes
    Last-Modified: Mon, 11 Sep 2006 12:37:35 GMT
    ETag: "f8acff9fd5c61:a14"
    Content-Length: 84393

    Connection closed by foreign host.
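
    For turning that header into something a script can compare, here is a
    small sketch using only core Perl modules. The function names
    (last_modified, http_date_to_epoch) are made up for this example, and it
    only handles the RFC 1123 date form shown in the response above, not the
    older formats a server is also allowed to send.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::Local qw(timegm);

my %MON = (Jan => 0, Feb => 1, Mar => 2, Apr => 3, May => 4,  Jun => 5,
           Jul => 6, Aug => 7, Sep => 8, Oct => 9, Nov => 10, Dec => 11);

# Pick the Last-Modified value out of a raw block of response headers.
sub last_modified {
    my ($headers) = @_;
    return $headers =~ /^Last-Modified:\s*(.+?)\r?$/mi ? $1 : undef;
}

# Convert "Mon, 11 Sep 2006 12:37:35 GMT" to seconds since the epoch.
# Returns undef for anything that is not in that exact form.
sub http_date_to_epoch {
    my ($date) = @_;
    return undef unless defined $date;
    my ($d, $mon, $y, $h, $m, $s) =
        $date =~ /^\w{3}, (\d{2}) (\w{3}) (\d{4}) (\d{2}):(\d{2}):(\d{2}) GMT$/
        or return undef;
    return undef unless exists $MON{$mon};
    return timegm($s, $m, $h, $d, $MON{$mon}, $y);
}

# Headers copied from the transcript above.
my $response = join "\r\n",
    'HTTP/1.1 200 OK',
    'Content-Type: image/jpeg',
    'Last-Modified: Mon, 11 Sep 2006 12:37:35 GMT',
    'Content-Length: 84393',
    '', '';

my $epoch = http_date_to_epoch(last_modified($response));
print scalar(gmtime $epoch), "\n";   # prints "Mon Sep 11 12:37:35 2006"
```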

    Hrm... I just talked about a head request with a straight face.

  • From Angus McLeod@VERT/ANJO to Digital Noise on Monday, September 11, 2006 14:22:00
    Re: Re: How?
    By: Digital Noise to Angus McLeod on Mon Sep 11 2006 00:57:00

    > Only way I know of to recover text data from a photograph (being a
    > binary medium) would be if the photo had EXIF (sp?) data encoded -
    > typically placed by digital cameras and contains Camera Make/Model;
    > Fstop; Shutter Speed; Date/time, etc.

    I've already examined the file for any embedded date (in a comment or
    elsewhere). The trouble is that the image is taken at the time shown, but
    the JPEG file could be produced some time later, so I can't, for example,
    ask the webserver for the file date or anything like that.

    > Other than that, I'm not sure how you would obtain text data from a
    > binary object such as a photograph through automatic means...

    Well, I could maybe cut up the image and do a sort of crude pattern-match
    against chunks previously cut out and manually identified, but I don't
    know exactly how effective that would be....
  • From Angus McLeod@VERT/ANJO to Deuce on Monday, September 11, 2006 18:10:00
    Re: How?
    By: Deuce to Angus McLeod on Mon Sep 11 2006 11:30:00

    > Using imagemagick to grab the bottom bit and save as a pbm then running
    > it through ocrad (http://www.gnu.org/software/ocrad/ocrad.html) should
    > do the trick I would think.

    Dunno about ocrad, but this:

    -----------8<-----------------

    #!/usr/bin/perl -w

    use strict;
    use Image::Magick;

    # Hex signature of each 7x8 glyph, as produced by gettext() below.
    my %digit = (
        "00000000000000" => 0, # blank space = 0
        "be6030990c067d" => 0,
        "0806028140207c" => 1,
        "be2010c41104fe" => 2,
        "7f10040302067d" => 3,
        "20188a24fa8340" => 4,
        "ff40e00704067d" => 5,
        "3c41a0370c067d" => 6,
        "7f200882200802" => 7,
        "1c9188230a067d" => 8,
        "be6030ec05823c" => 9,
    );

    # filling in the other months will take a YEAR!!!!
    my %month = (
        "3efefc820408060810f4f1e3072440004880409000011f7f02" => 9, # SEP
    );

    my $ink;    # pixel value that counts as "ink"; set by setink()

    # Sample one background ("paper") pixel and use the opposite extreme as
    # the ink value. 65535 assumes a 16-bit (Q16) ImageMagick build.
    sub setink {
        my ($img, $x, $y) = @_;

        my @pixel = $img->GetPixels(
            width     => 1,
            height    => 1,
            x         => $x,
            y         => $y,
            map       => 'RGB',
            normalize => 0
        );

        my $paper = $pixel[0];
        $ink = ($paper == 0) ? 65535 : 0;
    }

    # Read a $w x 8 pixel block at ($x, $y) and return it as a hex string,
    # one bit per pixel (1 = ink).
    sub gettext {
        my ($img, $x, $y, $w) = @_;
        $w ||= 7;

        my @pixel = $img->GetPixels(
            width     => $w,
            height    => 8,
            x         => $x,
            y         => $y,
            map       => 'RGB',
            normalize => 0
        );

        my $bitstring = "";
        my $offset = 0;
        # Step through the R,G,B triples, testing the red channel only.
        for (my $i = 0; $i <= $#pixel; $i += 3) {
            if ($pixel[$i] == $ink) {
                vec( $bitstring, $offset++, 1 ) = 1;
                #print "#";
            } else {
                vec( $bitstring, $offset++, 1 ) = 0;
                #print " ";
            }
            #if (($offset % $w) == 0) { print "\n" }
        }
        # Convert the packed bits to two hex digits per byte.
        $bitstring =~ s/(.|\n)/sprintf("%02lx", ord $1)/eg;
        return $bitstring;
    }

    my $file = $ARGV[0];
    die "usage: $0 image.jpg\n" unless defined $file;

    my $p = Image::Magick->new;
    my $err = $p->Read( $file );
    die "$err\n" if "$err";
    $p->Posterize( levels => 2, dither => 0 );
    setink( $p, 150, 470 );

    # Date field: "DD MON YY"
    my $d1 = gettext( $p, 189, 469 );
    my $d2 = gettext( $p, 198, 469 );
    my $m  = gettext( $p, 216, 469, 25 );
    my $y1 = gettext( $p, 252, 469 );
    my $y2 = gettext( $p, 261, 469 );

    my $yyyymmdd;
    if ((defined $digit{$d1}) && (defined $digit{$d2})
        && (defined $month{$m})
        && (defined $digit{$y1}) && (defined $digit{$y2})) {

        $yyyymmdd = sprintf "%04d-%02d-%02d",
            (($digit{$y1} * 10) + $digit{$y2}) + 2000,
            $month{$m},
            (($digit{$d1} * 10) + $digit{$d2});
    } else {
        $yyyymmdd = "????-??-??";
    }

    # Time field: "HHMM"
    my $h1 = gettext( $p, 306, 469 );
    my $h2 = gettext( $p, 315, 469 );
    my $m1 = gettext( $p, 324, 469 );
    my $m2 = gettext( $p, 333, 469 );

    my $hhmm;
    if ((defined $digit{$h1}) && (defined $digit{$h2})
        && (defined $digit{$m1}) && (defined $digit{$m2})) {

        $hhmm = sprintf "%02d:%02d",
            (($digit{$h1} * 10) + $digit{$h2}),
            (($digit{$m1} * 10) + $digit{$m2});
    } else {
        $hhmm = "??:??";
    }

    #foreach my $key (keys %digit) {
    #    print "$key = $digit{$key}\n";
    #}

    printf "%s %s\n", $yyyymmdd, $hhmm;

    -------------------->8----------------------

    works SOME of the time. For instance, it works on

    http://hadar.cira.colostate.edu/ramsdis/online/data/rmtcrso/502.jpg

    which is thermal infrared, but not on

    http://hadar.cira.colostate.edu/ramsdis/online/data/rmtcrso/483.jpg

    which is short-wave IR. Note the cyan ink on white paper! My attempt to
    beat the problem with setink() failed, because the characters themselves
    are distorted. (Uncomment the print statements in gettext() to see.) I
    can't help feeling that some combination of ImageMagick functions ought
    to be able to clean up the image and make reading the data reliable, but
    I don't know ImageMagick all that well.

    Fortunately, the short-wave IR images are the ugliest, and I'm even
    thinking of dropping them from consideration altogether and sticking with
    VIS and IR4 only.
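
    One possible way to cope with slightly distorted glyphs, short of
    cleaning up the image, is to accept the nearest known signature instead
    of demanding an exact hash hit. The sketch below is a swapped-in
    technique, not what the script above does: it counts differing bits
    (Hamming distance) between hex signatures. The function names and the
    4-bit tolerance are assumptions for illustration, and it is untested
    against the short-wave IR images.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count the bits that differ between two equal-length hex signatures.
# Returns undef when the lengths do not match.
sub hamming {
    my ($x, $y) = @_;
    return undef unless length($x) == length($y);
    my $diff = pack('H*', $x) ^ pack('H*', $y);   # bytewise string XOR
    return unpack('%32b*', $diff);                # number of set bits
}

# Return the value of the closest signature in %$table, or undef when even
# the closest one differs in more than $max_bits pixels.
sub nearest {
    my ($sig, $table, $max_bits) = @_;
    my ($best, $best_dist);
    for my $known (keys %$table) {
        my $d = hamming($sig, $known);
        next unless defined $d;
        ($best, $best_dist) = ($known, $d)
            if !defined $best_dist || $d < $best_dist;
    }
    return undef unless defined $best_dist && $best_dist <= $max_bits;
    return $table->{$best};
}

# Digit signatures copied from the script above.
my %digit = (
    "00000000000000" => 0, "be6030990c067d" => 0, "0806028140207c" => 1,
    "be2010c41104fe" => 2, "7f10040302067d" => 3, "20188a24fa8340" => 4,
    "ff40e00704067d" => 5, "3c41a0370c067d" => 6, "7f200882200802" => 7,
    "1c9188230a067d" => 8, "be6030ec05823c" => 9,
);

my $smudged = "0806028140207d";   # the "1" glyph with one pixel flipped
my $guess   = nearest($smudged, \%digit, 4);
print defined $guess ? "$guess\n" : "?\n";   # prints "1"
```

    A tolerance that is too generous would start confusing similar digits,
    so the threshold would need tuning against real captures.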

    <ponders>

  • From Angus McLeod@VERT/ANJO to Deuce on Monday, September 11, 2006 18:24:00
    Re: How?
    By: Deuce to Angus McLeod on Mon Sep 11 2006 11:36:00

    > Actually, in looking at it, it appears that the timestamp info is already
    > pretty darn close to correct, depending on the time zone it's in.

    > A HEAD request should give you this info in the Last-Modified header.

    I tried

    perl -MLWP::Simple -e '@z=head("http://hadar.cira.colostate.edu/"
    . "ramsdis/online/data/rmtcrso/502.jpg");
    print scalar gmtime($z[2]), "\n";'

    and got

    Mon Sep 11 21:38:23 2006

    for an image that is actually dated Mon Sep 11 21:15. This might not be
    too bad. The images seem to all be ??:15 or ??:45 so I could roll back to
    the most recent 15/45 minute spot and go with that. I'd *much* rather
    read it off the image (see earlier post) but if that can't work...
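
    The roll-back itself is simple arithmetic on the epoch value. A minimal
    sketch, assuming the images really are always stamped at ??:15 or ??:45
    UTC and that the file is written less than 30 minutes after the image
    time (otherwise it lands on the wrong slot). The function name is made
    up for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX qw(strftime);
use Time::Local qw(timegm);

# Roll an epoch time back to the most recent hh:15 or hh:45 (UTC).
sub roll_back_to_slot {
    my ($epoch) = @_;
    my $t = $epoch - 15 * 60;    # shift so the slots fall on :00 and :30
    $t -= $t % (30 * 60);        # round down to the half hour
    return $t + 15 * 60;         # shift back again
}

# The Last-Modified time reported above: Mon Sep 11 21:38:23 2006 UTC.
my $modified = timegm(23, 38, 21, 11, 8, 2006);
my $slot     = roll_back_to_slot($modified);
print strftime('%Y-%m-%d %H:%M', gmtime($slot)), "\n";   # prints "2006-09-11 21:15"
```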

    > Hrm... I just talked about a head request with a straight face.

    You are working too hard.
  • From Digital Man@VERT to Angus McLeod on Monday, September 11, 2006 15:23:52
    Re: How?
    By: Angus McLeod to All on Sun Sep 10 2006 09:43 pm

    To actually parse the text from the image data, you'll need to use OCR.
    Yuck. It'd be better if you could find some metadata somewhere (in the
    JPEG header, perhaps?).

    digital man

  • From Belly@VERT/BRAZINET to Angus McLeod on Monday, September 11, 2006 21:00:00
    Re: How?
    By: Angus McLeod to All on Sun Sep 10 2006 10:43 pm

    It's a shame that they didn't store the stuff in the jpeg metadata also. That would have made it too easy.

    My suggestion:

    Make a copy of the image with all but the text cropped out... You could use ImageMagick for this:
    http://www.imagemagick.org/script/perl-magick.php

    and then try this:

    http://search.cpan.org/~jmastros/OCR-PerfectCR-0.03/lib/OCR/PerfectCR.pm



    o
    (O)
    BeLLy


    ---
    þ Synchronet
  • From Belly@VERT/BRAZINET to Deuce on Monday, September 11, 2006 21:02:00
    Re: How?
    By: Deuce to Angus McLeod on Mon Sep 11 2006 12:36 pm

    > Hrm... I just talked about a head request with a straight face.

    hee hee

