
Photo Forum / General Photo Topics / UK Photography / August 2005


Indexing and searching huge volumes of images????

Umgall - 01 Aug 2005 14:30 GMT
I know that someone out there can help me with a problem.

I'm being asked to find some software which can index and allow searches on
huge volumes of images.  Most of these images will be TIFFs, and to be
honest, I'm expecting there to be about 1.5 million at the end of the
project.  Ouch.

Basically I need to be able to store 'metadata' against each image, and to
search this metadata very quickly.  Ideally, the metadata would be stored in
an SQL database, and would provide hyperlinks to images on the file system.
I need to have a description (up to 2k of text), a date and a location.

So, I could search for "hyde park" and if this phrase occurs within the
metadata fields of any of the 1.5 million images, the hits would be
displayed (along with the metadata) and I could click through to the image.
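
A minimal sketch of the kind of setup described above, using Python's built-in sqlite3 module as a stand-in for whatever SQL database ends up being chosen. The table name, column names, file paths and sample captions are all made up for illustration; only the fields (description, date, location, link to the file) come from the requirement.

```python
import sqlite3

def create_catalogue(conn):
    # One row per image; file_path is the "hyperlink" to the TIFF on disk.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS images (
            image_id    INTEGER PRIMARY KEY,
            file_path   TEXT NOT NULL,
            description TEXT,   -- free text, up to about 2k
            taken_on    TEXT,   -- ISO date string, e.g. 1932-07-14
            location    TEXT
        )
        """
    )

def add_image(conn, file_path, description, taken_on, location):
    cur = conn.execute(
        "INSERT INTO images (file_path, description, taken_on, location) "
        "VALUES (?, ?, ?, ?)",
        (file_path, description, taken_on, location),
    )
    return cur.lastrowid

def search_phrase(conn, phrase):
    # Substring match on the text fields; % and _ in the phrase act as wildcards.
    pattern = "%" + phrase + "%"
    cur = conn.execute(
        "SELECT image_id, file_path, description, taken_on, location "
        "FROM images WHERE description LIKE ? OR location LIKE ? "
        "ORDER BY image_id",
        (pattern, pattern),
    )
    return cur.fetchall()

conn = sqlite3.connect(":memory:")
create_catalogue(conn)
add_image(conn, "/scans/0001.tif", "Bandstand in Hyde Park, summer fete",
          "1932-07-14", "London")
add_image(conn, "/scans/0002.tif", "Harbour at low tide", "1948-03-02", "Whitby")

hits = search_phrase(conn, "hyde park")
for row in hits:
    print(row)
```

A LIKE search with a leading wildcard has to scan every row, so at 1.5 million records a full-text index (SQLite's FTS5 extension is one option, and most server databases have an equivalent) would probably be needed to keep searches quick.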

Does anyone know of any system that can do what I need?  Any help would be
gratefully appreciated.  The alternative is to develop an application, but
if there is an off-the-shelf solution then this is obviously going to be
better!

Umgall.
Michael Cargill - 01 Aug 2005 15:35 GMT
> I know that someone out there can help me with a problem.
>
> I'm being asked to find some software which can index and allow searches on
> huge volumes of images.  Most of these images will be TIFFs, and to be
> honest, I'm expecting there to be about 1.5 million at the end of the
> project.  Ouch.

I can't answer the question but you might want to also ask this on some more
technical newsgroups - perhaps something like alt.comp.databases...?
Gordon Hudson - 01 Aug 2005 16:21 GMT
>> I know that someone out there can help me with a problem.
>>
[quoted text clipped - 7 lines]
> more
> technical newsgroups - perhaps something like alt.comp.databases...?

You need to work out what you want to get out of the database in the end.
This will set what information you need to record and the way it is recorded.
Then you need to look at how you are going to access the data and how many
people will need access to it and when.
This will then affect what database system you end up using.
You also need to look and see if there is a commercially available system as
buying it may be cheaper than developing your own.

It all boils down to the use it will be put to.
If it was for access from multiple locations by multiple users I would be
looking at Oracle or MSSQL.
If it's just one person sitting at a computer I would probably not risk MS
Access as I don't know how it handles such large databases. You would need
to get some advice on that from people who have run big databases.
Bandicoot - 04 Aug 2005 19:03 GMT
"Gordon Hudson" <gordon@usenet3.hostroute.co.uk> wrote in message
news:42ee3e09$0$38038
[SNIP]
> If it's just one person sitting at a computer I would
> probably not risk MS Access as I don't know how
> it handles such large databases. You would need
> to get some advice on that from people who have
> run big databases.

Last time I had anything to do with this sort of thing we wouldn't put a
base that size in Access.  Even at 250,000 items we'd prefer SQL.  Big
databases we mostly used SQL or a Teradata solution (sometimes with a
Natural Language front end).

These were not usually on PC architectures, though they would be accessed
via PCs: Auspex was a good choice if not putting them on the mainframes.
One DB that was used by only about four people, but which was big, went on
an old SGI box attached to the office network, which worked well and was
cheap.

Peter
Willy Eckerslyke - 01 Aug 2005 16:26 GMT
> I'm being asked to find some software which can index and allow searches on
> huge volumes of images.  Most of these images will be TIFFs, and to be
> honest, I'm expecting there to be about 1.5 million at the end of the
> project.  Ouch.

Have a look at ThumbsPlus:
http://www.cerious.com/image-database.shtml
There's a mention of sql in there somewhere.

> Basically I need to be able to store 'metadata' against each image, and to
> search this metadata very quickly.  Ideally, the metadata would be stored in
> an SQL database, and would provide hyperlinks to images on the file system.
> I need to have a description (up to 2k of text), a date and a location.

I'm in the middle of setting up a database driven website along similar
lines (though I doubt if it'll ever get past tens of thousands of
images!) using mysql and php. Basic searches are pretty straightforward,
but of course I want it to do more - show categories which can be
clicked on to refine the search to fewer and fewer images, similar to
Ebay's searching system. I should have it finished in a while...
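
A rough sketch of that "click a category to narrow the search" idea, written in Python with sqlite3 rather than the mysql and php mentioned above, purely so it runs on its own. The county and decade columns and the sample rows are assumptions for illustration; the point is that each level of refinement is just a GROUP BY count over the rows that match the filters chosen so far.

```python
import sqlite3

FACETS = ("county", "decade")  # hypothetical category columns

def facet_counts(conn, facet, filters):
    """Count images per value of `facet` among rows matching `filters`."""
    # Column names cannot be bound as parameters, so check them against a whitelist.
    if facet not in FACETS or any(col not in FACETS for col in filters):
        raise ValueError("unknown category column")
    where = " AND ".join(f"{col} = ?" for col in filters) or "1 = 1"
    sql = (
        f"SELECT {facet}, COUNT(*) FROM images "
        f"WHERE {where} GROUP BY {facet} ORDER BY {facet}"
    )
    return conn.execute(sql, list(filters.values())).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE images (path TEXT, county TEXT, decade TEXT)")
conn.executemany(
    "INSERT INTO images VALUES (?, ?, ?)",
    [
        ("/scans/1.tif", "Kent", "1930s"),
        ("/scans/2.tif", "Kent", "1940s"),
        ("/scans/3.tif", "Kent", "1930s"),
        ("/scans/4.tif", "Surrey", "1930s"),
    ],
)

top_level = facet_counts(conn, "county", {})                    # first page of categories
within_kent = facet_counts(conn, "decade", {"county": "Kent"})  # after clicking "Kent"
print(top_level)
print(within_kent)
```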
Roger Whitehead - 01 Aug 2005 16:41 GMT
> Basically I need to be able to store 'metadata' against each image

The IPTC standard was created for this very purpose - see
http://www.peterkrogh.com/Pages/digital/iptc.html. What you need therefore
is some software that will let you add it to an image and then some
(perhaps the same) that will let you search it all.

There are several products that will do one or both and they need not be
expensive. IrfanView, for example, lets you add IPTC data - and that's
free. This product (which I’ve not tried) does both jobs -
http://peccatte.karefil.com/Kalimages/EN/Index.html . Another, possibly
more robust, is here - http://www.camerabits.com/pages/PM4.html .

Perhaps other people here know of some. Most pro photographers need
something like it. There's certainly no need to reinvent the wheel and
start messing around with SQL.

Signature

Roger

Willy Eckerslyke - 02 Aug 2005 10:01 GMT
> Perhaps other people here know of some. Most pro photographers need
> something like it. There's certainly no need to reinvent the wheel and
> start messing around with SQL.

I agree in principle, but as the OP referred to millions of images,
there's going to be a massive investment in time just inputting the
data. In comparison, a few days spent messing with SQL to get something
that does this specific job perfectly, and nothing else, could be time
well spent.
Fine if an off-the-shelf product will work with no compromises, but if
that product doesn't quite fit the requirements or is a bit clunky in
its application - even if it just means an extra mouse click or two -
any small irritation multiplied by 1.5 million is likely to end up as a
major headache.
Roger Whitehead - 02 Aug 2005 10:45 GMT
> I agree in principle, but as the OP referred to millions of images,
> there's going to be a massive investment in time just inputting the
> data. In comparison, a few days spent messing with SQL to get something
> that does this specific job perfectly, and nothing else, could be time
> well spent.

How is that going to speed data inputting? If it doesn't exist in
machine-readable form (and Umgall hasn't said it does), entering it to a
database form is going to be no quicker than entering to a purpose-made
product, possibly the reverse.

Signature

Roger

Willy Eckerslyke - 02 Aug 2005 11:51 GMT
>>I agree in principle, but as the OP referred to millions of images,
>>there's going to be a massive investment in time just inputting the
>>data. In comparison, a few days spent messing with SQL to get something
>>that does this specific job perfectly, and nothing else, could be time
>>well spent.

> How is that going to speed data inputting? If it doesn't exist in
> machine-readable form (and Umgall hasn't said it does), entering it to a
> database form is going to be no quicker than entering to a purpose-made
> product, possibly the reverse.

You don't access the database directly, you write your own form in PHP
that only asks what you want it to and only shows the fields you need.
So instead of a page full of text fields, you may only have one or two
and a submit button. If a field only ever needs to contain one of a
choice of text strings, you can set up your form so that you click on a
radio button to choose one from a list, rather than having to type it in
afresh every time.
If you want, you can tell it to pre-fill the form fields with the last
image's data for you to edit rather than start afresh for every image.
Also you could set it up to bulk fill certain fields if you want it to.

With a little thought, your input form should be _the_ most efficient
way of inputting data. No purpose-made product could ever be as streamlined.
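
To make the pre-fill and fixed-choice ideas concrete, here is a small sketch of the logic behind such a form. It is in Python rather than PHP, and the field names and the list of counties are invented for the example; a real form would wrap the same logic in whatever web framework is in use.

```python
COUNTIES = ("Kent", "Surrey", "Sussex")  # hypothetical fixed list shown as radio buttons

def next_form_defaults(previous, carry_over=("county", "date")):
    """Start the next form from the last image's data instead of from blank."""
    defaults = {"county": "", "date": "", "description": ""}
    if previous:
        for field in carry_over:
            defaults[field] = previous.get(field, "")
    return defaults

def validate(record):
    """Return a list of problems; an empty list means the record can be saved."""
    errors = []
    if record["county"] not in COUNTIES:
        errors.append("county must be one of: " + ", ".join(COUNTIES))
    if not record["description"].strip():
        errors.append("description is required")
    return errors

last_saved = {"county": "Kent", "date": "1932-07-14",
              "description": "Bandstand, summer fete"}
defaults = next_form_defaults(last_saved)
print(defaults)            # county and date carried over, description left blank
print(validate(defaults))  # only the description still needs typing
```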
Roger Whitehead - 02 Aug 2005 13:19 GMT
> > entering it to a
> > database form is going to be no quicker than entering to a purpose-made
> > product, possibly the reverse.
>
> You don't access the database directly, you write your own form in PHP

You're splitting hairs now.

> With a little thought, your input form should be _the_ most efficient
> way of inputting data. No purpose-made product could ever be as streamlined.

Unless one has looked at all the significant products, one cannot know. A
sensible buying process would be to do this first, then look into a
roll-your-own answer once one has a basis for comparison.

Signature

Roger

Willy Eckerslyke - 02 Aug 2005 15:09 GMT
>>You don't access the database directly, you write your own form in PHP

> You're splitting hairs now.

Hardly. That's fundamental to the whole thing.

>>With a little thought, your input form should be _the_ most efficient
>>way of inputting data. No purpose-made product could ever be as
[quoted text clipped - 3 lines]
> sensible buying process would be to do this first, then look into a
> roll-your-own answer once one has a basis for comparison.

I have difficulty remembering so far back but I thought that was pretty
much what I suggested in the first place, hence my link to cerious.com.
Roger Whitehead - 02 Aug 2005 15:41 GMT
> > You're splitting hairs now.
>
> Hardly. That's fundamental to the whole thing.

Life's too short to nail your feet to the floor so I'll stop bothering.

> I have difficulty remembering so far back but I thought that was pretty
> much what I suggested in the first place,

Your memory clearly is failing. You suggested one product, not a survey of
them.

Signature

Roger

Willy Eckerslyke - 02 Aug 2005 15:54 GMT
> Life's too short to nail your feet to the floor so I'll stop bothering.

Still have to have the last word though, eh?

>>I have difficulty remembering so far back but I thought that was pretty
>>much what I suggested in the first place,
>
> Your memory clearly is failing. You suggested one product, not a survey of
> them.

Any idea what I had for tea yesterday? I'm trying to decide whether I
need to shop on the way home.
Umgall - 02 Aug 2005 11:57 GMT
>> I agree in principle, but as the OP referred to millions of images,
>> there's going to be a massive investment in time just inputting the
[quoted text clipped - 6 lines]
> database form is going to be no quicker than entering to a purpose-made
> product, possibly the reverse.

I suppose to be fair, the 'metadata' will exist in machine readable form.
This will be generated from an existing database, and if the application
supports it, will be imported in XML.  There aren't many fields, but it is
vital that these can be searched:  County, Date, Description, Surname,
Forename, Placename and image ID.
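
For what it's worth, a bulk import along those lines could look roughly like the sketch below. The XML layout and element names are assumptions (the real export format hasn't been described); only the seven field names come from the list above. It uses Python's standard xml.etree and sqlite3 modules as stand-ins for the eventual tools.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Element and column names are hypothetical; adjust to the real export.
FIELDS = ("image_id", "county", "date", "description",
          "surname", "forename", "placename")

SAMPLE_XML = """
<images>
  <image>
    <image_id>1001</image_id>
    <county>Kent</county>
    <date>1932-07-14</date>
    <description>Family group outside the forge</description>
    <surname>Marsh</surname>
    <forename>Edith</forename>
    <placename>Romney</placename>
  </image>
  <image>
    <image_id>1002</image_id>
    <county>Surrey</county>
    <date>1948-03-02</date>
    <description>School sports day</description>
    <surname>Cole</surname>
    <forename>Arthur</forename>
    <placename>Dorking</placename>
  </image>
</images>
"""

def load_records(xml_text):
    """Yield one tuple per <image> element, in FIELDS order."""
    root = ET.fromstring(xml_text.strip())
    for node in root.iter("image"):
        yield tuple((node.findtext(name) or "").strip() for name in FIELDS)

def import_xml(conn, xml_text):
    columns = ", ".join(f"{name} TEXT" for name in FIELDS)
    conn.execute(f"CREATE TABLE IF NOT EXISTS images ({columns})")
    placeholders = ", ".join("?" for _ in FIELDS)
    with conn:  # one transaction for the whole batch
        conn.executemany(
            f"INSERT INTO images VALUES ({placeholders})", load_records(xml_text)
        )
    # Index the columns that will be searched by exact value.
    conn.execute("CREATE INDEX IF NOT EXISTS idx_surname ON images (surname)")
    conn.execute("CREATE INDEX IF NOT EXISTS idx_county ON images (county)")

conn = sqlite3.connect(":memory:")
import_xml(conn, SAMPLE_XML)
rows = conn.execute(
    "SELECT image_id, placename FROM images WHERE surname = ?", ("Marsh",)
).fetchall()
print(rows)
```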

Willy is right - due to the huge volumes, it's important to get something
which is flexible to allow us to search quickly and return matches, then to
display the image with one keyclick.  Browsing the images is important too,
but fast search capabilities are vital.

Thanks for the suggestions so far!

Umgall.
Neil Barker - 02 Aug 2005 12:37 GMT
> I suppose to be fair, the 'metadata' will exist in machine readable form.
> This will be generated from an existing database, and if the application
> supports it, will be imported in XML.  There aren't many fields, but it is
> vital that these can be searched:  County, Date, Description, Surname,
> Forename, Placename and image ID.

I tell you - you want Fotostation Pro - does all that straight out of
the box :-)

Signature

Neil Barker

Phil Kyle - 20 Aug 2005 16:43 GMT
>> I suppose to be fair, the 'metadata' will exist in machine readable
>> form. This will be generated from an existing database, and if the
[quoted text clipped - 4 lines]
> I tell you - you want Fotostation Pro - does all that straight out of
> the box :-)

Closest you've ever been to a box.

Signature

Phil Kyle™  
Uno
Dos
Tres
Cuatro
CINCO!!!!!!

"Be very aware that my willingness
to continue to criticise your sig
is infinite." -- Neil Barker

ah - 21 Aug 2005 01:57 GMT
>>> I suppose to be fair, the 'metadata' will exist in machine readable
>>> form. This will be generated from an existing database, and if the
[quoted text clipped - 6 lines]
>
> Closest you've ever been to a box.

Is that a euphemism?
Signature

ah fait loucher un bon oeil

Neil Barker - 01 Aug 2005 17:42 GMT
> I know that someone out there can help me with a problem.
>
[quoted text clipped - 7 lines]
> an SQL database, and would provide hyperlinks to images on the file system.
> I need to have a description (up to 2k of text), a date and a location.

Yup, no problemo.

Have a look at Fotostation Pro and Index Manager.

http://www.fotoware.com

Fotostation Pro is the front-end application, which works as a
standalone image cataloguer / editor, but really comes into its own
when connected to a server running Index Manager.

Essentially what happens is this:-

When an image is sent to the server from Fotostation Pro, Index Manager
reads the data contained in the IPTC fields and adds it to an index,
with a pointer to that image file location for later retrieval.

When using the search facility in Fotostation, rather than having to
search through thousands of files, all it needs to do is to consult the
master index - any matches can then be found in seconds.
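
The general idea of a master index can be sketched in a few lines. This is only an illustration of the principle (a word-to-file lookup built once, then consulted at search time) and not a description of how Index Manager is actually implemented; the captions and paths are invented.

```python
import re
from collections import defaultdict

def tokenize(text):
    # Lower-case words and numbers only; punctuation is dropped.
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(captions):
    """captions maps file path -> caption text; returns word -> set of paths."""
    index = defaultdict(set)
    for path, caption in captions.items():
        for word in tokenize(caption):
            index[word].add(path)
    return index

def search_all(index, query):
    """Return the paths whose caption contains every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result

captions = {
    "/archive/0001.tif": "Bandstand in Hyde Park, summer fete",
    "/archive/0002.tif": "Car park flooding after storm",
    "/archive/0003.tif": "Hyde Park Corner at night",
}
index = build_index(captions)
found = search_all(index, "hyde park")
print(sorted(found))
```

A search touches only the index entries for the query words, never the image files themselves, which is why lookups stay fast as the archive grows. Note that this matches captions containing both words anywhere, not the exact phrase.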

You'll find that many newspapers, mine included, run this system and it
does work extremely well. We currently have just under 100,000 images
online and searching on a keyword or phrase takes literally a few
seconds. Index Manager has the capacity to search millions of images,
potentially spread over several servers using something called "Cluster
Commander" (which enables many servers to be treated effectively as one
big one). It can also do Boolean algebra searches using AND/OR/NOT
together with phonetic searches and more.
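
As an illustration of what phonetic and Boolean searching mean in practice, here is a small sketch using the classic Soundex code and plain set operations. Which phonetic algorithm Index Manager really uses is not stated here, so Soundex is an assumption, and the surnames and paths are made up.

```python
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        CODES[ch] = digit

def soundex(name):
    """Four-character Soundex code, e.g. a letter followed by three digits."""
    letters = [c for c in name.lower() if c.isalpha()]
    if not letters:
        return ""
    encoded = letters[0].upper()
    previous = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same code
        code = CODES.get(ch, "")
        if code and code != previous:
            encoded += code
        previous = code  # vowels reset this, so repeats across a vowel count
    return (encoded + "000")[:4]

surnames = {"/a.tif": "Smith", "/b.tif": "Smyth", "/c.tif": "Jones"}
phonetic_index = {}
for path, surname in surnames.items():
    phonetic_index.setdefault(soundex(surname), set()).add(path)

sounds_like_smith = phonetic_index[soundex("Smith")]
in_kent = {"/a.tif", "/c.tif"}  # pretend result of a separate county search

and_hits = sounds_like_smith & in_kent   # Smith-like AND Kent
or_hits = sounds_like_smith | in_kent    # Smith-like OR Kent
not_hits = sounds_like_smith - in_kent   # Smith-like AND NOT Kent
print(sorted(and_hits), sorted(or_hits), sorted(not_hits))
```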

It can also be connected to a WWW front-end, which is a Java
application enabling online viewing/ordering etc.

If you need further help with this, feel free to get in touch.

Signature

Neil Barker

Phil Kyle - 20 Aug 2005 16:43 GMT
>> I know that someone out there can help me with a problem.
>>
[quoted text clipped - 42 lines]
>
> If you need further help with this, feel free to get in touch.

He means that literally.


ah - 21 Aug 2005 01:56 GMT
>>> I know that someone out there can help me with a problem.
>>>
[quoted text clipped - 44 lines]
>
> He means that literally.

Oooohhhh..
Signature

ah fait loucher un bon oeil

infinity - 02 Aug 2005 00:20 GMT
> I know that someone out there can help me with a problem.
>
[quoted text clipped - 6 lines]
> and to search this metadata very quickly.  Ideally, the metadata would
> be stored in an SQL database,

Have a look at ThumbsPlus 7 by Cerious, which uses the Access database
format, although it functions as a standalone application. You can have
keywords, user defined fields that take numeric or string values, and also
add lengthy comments to images. The thumbnails view can show all your own
fields & keywords plus EXIF data etc, and info embedded in the file can be
used to generate keywords if you like, as can its name and folder path.
There's a 30 day free trial available. Since the database is now Access
format, you should be able to open it directly if you need more
functionality and use your own search macros.
I'm not sure how well it copes with millions of images but certainly tens
of thousands is no problemo.
 



©2008 Advenet LLC   Privacy Policy - Terms of Use
This website includes both content owned or controlled by Advenet as well as content owned or controlled by third parties.