Friday, September 14th, 2007 07:13 pm
Just as an offhand request, does anyone out there know of a simple program for managing and searching simple lists of data? In particular, I'm imagining something where I could easily input a list of items, each with a title, brief description, maybe some other data (ratings?), and arbitrarily many keywords to describe it. I'd want the description text to be searchable, and I'd want to be able to browse lists of items with a given keyword (or, ideally, boolean combination of keywords).

I'm pretty sure that what I'm describing here would be satisfied by practically any database. And indeed, I happen to have installed MySQL on my laptop years ago (any important updates since 4.0.18?). But 1) that seems like massive overkill, and 2) if it would take me more than about five minutes to set it up and learn how to use it, I don't have time: this is for a maybe-useful maybe-vaguely-work-related project that is far, far from essential. (As I recall, my earlier look at MySQL sputtered out largely because the tutorials I saw all suffered from either a glacial pace, an overdose of obscure syntax variants and data models, an obsession with moral purity in relational database design, or all of them at once.) By contrast, I can just look at a program like iTunes and understand how to enter data, search and sort, create "smart playlists", and all sorts of other things.

I've actually considered just abusing the excellent Mac bibliography/research-paper-PDF manager BibDesk to do what I want. If I just pretended each data entry was a scientific paper, I think it could work fine. But given how easily I could adapt it (or maybe even iTunes or iPhoto) to do this, it seems crazy to think that simple general purpose tools like this aren't a dime a dozen. I have no idea how to find a good one, though. Any suggestions?

(I understand that newer versions of MySQL include GUI tools to make things a bit more manageable, but I expect that those would still require me to learn a bunch of database jargon and design principles. I wouldn't mind the option to gradually transition to that level of knowledge, but I can't afford to start that way.)
Saturday, September 15th, 2007 05:03 am (UTC)
To me that sounds like an invitation to code something in Lisp (or maybe Perl): use the language's native data type for the list, and write the queries in the language. With something like Scheme (a sane Lisp dialect), the command line interface is sensible.

But I don't think that's quite as user-friendly as you want. Regrettably, I don't know of a proper solution.
Saturday, September 15th, 2007 06:32 pm (UTC)
Given that the sum total of my experience with Lisp has come from occasional attempts to tweak my .emacs file by hand, I don't think that's my best bet for "really easy" at this point. (I have memories of a great many parentheses whose purpose I didn't entirely understand. And " ` " characters, I think. Or were they " ' " characters? Either way, they seemed to have some sort of deep significance.) Perl is more of an option; I'm a bit rusty (even though I've programmed in it more recently than anything else) but I could produce something functional quite quickly. I just know that I'd be reinventing the wheel, though! And if I did it in Perl then I'd be tempted to write a CGI interface for myself, and that could spiral out of control. :)
Saturday, September 15th, 2007 01:17 pm (UTC)
Man, back in the day, I would so have written this in about 15 minutes in HyperCard. ;)

Unfortunately, as you say, I can think of a dozen application-specific programs like this, but I'm not coming up with anything general off the top of my head.
Saturday, September 15th, 2007 06:34 pm (UTC)
:) That's so true! Which CS class was it where you submitted some major project in HyperCard? (I hadn't thought about that in years...)
Saturday, September 15th, 2007 10:20 pm (UTC)
I remember there was one, but I'm not sure which one it was. I know I wrote an Assembly interpreter with visible registers in HyperCard, but I don't remember if that was for Mudd or grad school. I may have done a database thing, and I think there was something else. I really need to transfer my older Mac archives onto this computer before they're lost to the ether forever.
Saturday, September 15th, 2007 05:24 pm (UTC)
what about just using a text file? Simple searches can be done with "grep", and you could implement boolean searchability in 15 minutes in python or perl.
Saturday, September 15th, 2007 06:46 pm (UTC)
This was my thought, too. Simply have text files of lists, keywords are directories containing symlinks, and grep is your friend. At some point a SQL database becomes a good idea, but this is not yet that time.

Alternatively, MySQL and others have some graphical browsers that might make sense, and/or you might want to consider just using some spreadsheet program and saving everything in greppable .CSV format.
Saturday, September 15th, 2007 07:08 pm (UTC)
Interesting: I never even thought of using the file system itself as the data structure. I suppose it's easy to "code"... though I'm not seeing a quick way to do boolean keyword searches off the top of my head. And it seems like data entry would be a pain (how many times do I want to type "ln -s ..."?).
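One possible answer to the boolean-search question, as a minimal sketch: assuming a layout where each keyword is a directory under some root and each entry is a symlink (or plain file) inside it, AND, OR and NOT reduce to set operations on directory listings. The function names and the directory layout are illustrative assumptions, not something anyone in this thread specified.

```python
import os

def entries_tagged(root, keyword):
    """Names of the links found in root/keyword (empty set if no such directory)."""
    path = os.path.join(root, keyword)
    return set(os.listdir(path)) if os.path.isdir(path) else set()

def tagged_all(root, *keywords):
    """Boolean AND: entries that appear in every keyword directory."""
    sets = [entries_tagged(root, k) for k in keywords]
    return set.intersection(*sets) if sets else set()

def tagged_any(root, *keywords):
    """Boolean OR: entries that appear in at least one keyword directory."""
    return set().union(*[entries_tagged(root, k) for k in keywords])
```

NOT falls out of set difference: `tagged_all(root, "foo") - entries_tagged(root, "bar")` gives everything tagged "foo" but not "bar". This does nothing about the data-entry pain, of course.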
Saturday, September 15th, 2007 08:38 pm (UTC)
Data entry WOULD be kind of a pain. But I was thinking that most of your data entry would be lists, and not keywords. If it is mostly keywords then a different approach is called for. Probably you should use spreadsheet software for input and save in .CSV format. .CSV is just plain text, which means that it's handy in all sorts of ways. (See previous comments regarding grep)
Saturday, September 15th, 2007 11:07 pm (UTC)
Maybe I was unclear about what I'm looking for (or I've misunderstood your suggestion). In particular, I suspect that I would actually just be making a single list of data (or at most a handful of similar lists). Each list entry would be described by a handful of keywords.

So I interpreted your suggestion as "make a file for each list entry, and associate keywords to entries by creating symlinks to each file". But now I'm guessing that you thought I wanted keywords for each list, rather than for each entry.
Sunday, September 16th, 2007 08:23 pm (UTC)
Or write a 10 line python program to parse and search a csv, with the search done as code. In another 5-10 lines, you could implement a generic boolean search.
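A rough sketch of what that could look like, under some assumptions that are not from the thread: the CSV has columns named `title`, `description` and `keywords`, keywords within a cell are separated by semicolons, and a query is written as a nested tuple such as `("and", "foo", ("not", "bar"))`. The sample data is made up purely for illustration.

```python
import csv
import io

def load_entries(csv_file):
    """Read rows of title, description, keywords (keywords separated by ';')."""
    entries = []
    for row in csv.DictReader(csv_file):
        row["keywords"] = {k.strip() for k in row["keywords"].split(";") if k.strip()}
        entries.append(row)
    return entries

def matches(keywords, query):
    """Evaluate a query: a keyword string, or ('and'|'or'|'not', sub-queries...)."""
    if isinstance(query, str):
        return query in keywords
    op, *args = query
    if op == "and":
        return all(matches(keywords, a) for a in args)
    if op == "or":
        return any(matches(keywords, a) for a in args)
    if op == "not":
        return not matches(keywords, args[0])
    raise ValueError("unknown operator: %r" % (op,))

def search(entries, query):
    """Titles of all entries whose keyword set satisfies the query."""
    return [e["title"] for e in entries if matches(e["keywords"], query)]

# Made-up sample data; a real file would be opened with open(path, newline="").
SAMPLE = """title,description,keywords
Item A,First example item,foo;bar
Item B,Second example item,yawn
Item C,Third example item,foo
"""
entries = load_entries(io.StringIO(SAMPLE))
print(search(entries, ("or", ("and", "foo", "bar"), "yawn")))  # ['Item A', 'Item B']
```

Because the description is just another column, a plain substring test on `e["description"]` would cover the "searchable description text" part as well.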
Saturday, September 15th, 2007 07:02 pm (UTC)
Yeah, I guess that throwing something together in perl (python would mean learning a new language) would be pretty easy, though if I went that route I'd probably just hard-code the list into the program file (as Jon suggested) rather than coming up with a structured text file data format and writing some kind of parser. (I've written CGI scripts using both approaches in the past.)

But actually, my "abuse BibDesk" idea would almost work this way: it stores its data in plain text BibTeX files. As long as I ignored all the LaTeX-specific stuff, it would be a decent approximation of the idealized interface that I might want to write. (I think I can even create "smart folders" there to implement boolean operations, albeit with just as much of a hassle as in iTunes.) In fact, BibDesk uses some BSD license variant: with a bit of effort, one could presumably hack it into a more general-purpose program directly. But that's overkill for me.
Sunday, September 16th, 2007 08:21 pm (UTC)
Writing something to parse a line into keywords would be trivial, literally the work of a few minutes, in a scripting language like python or perl. The days where creating and parsing a "structured text file data format" take effort are long gone, thanks to perl. So it's not like it's much work. In python, if your text file format is "title, then keywords, tab separated", it would just be something like:

----
# Read in text file (myfilename is the path to your tab-separated list)
entries = []
for line in open(myfilename):
    entry = {}
    items = line.rstrip('\n').split('\t')
    entry['__title__'] = items[0]
    for item in items[1:]:  # keywords are everything after the first column
        entry[item] = 1
    entries.append(entry)
# Now you have a list of dicts, and it's trivial to query.
# entry.get() returns None for a missing keyword instead of raising KeyError.
search_results = [entry for entry in entries
                  if (entry.get('foo') and entry.get('bar')) or entry.get('yawn')]
print('\n'.join([result['__title__'] for result in search_results]))
----
That's it. I think it's actually easier to write those few lines of parsing than to hard-code the data into your program. And you could write something just as short in perl, although it would be uglier and harder to read.
Sunday, September 16th, 2007 09:04 pm (UTC)
You're right: that is simple, and something like this probably is the way to go. (I do like the simple look of python, too, at least when the indentation isn't swallowed by HTML's whitespace handling. Am I right that indentation is syntactically meaningful, given the lack of loop delimiters?)

But there's still some issue of balancing simplicity of parsing/searching against simplicity of data entry, because I would like to include a handful of other bits of data along with each entry in the list. That "brief description" would probably be a block of text somewhere between a sentence and a paragraph long. Writing (or reading) that without line breaks would be a pain, so the parsing routine would need to have some way of separating entries other than "one newline each". Hence my concern that writing a sufficient program would be more than a 10 minute job. (I might be able to just steal some of my own old code, though.)
Sunday, September 16th, 2007 09:31 pm (UTC)
Yep, python uses indentation to define blocks, saving you the {}. It's an awesome language.

You could do each entry in a different file if you wanted to include more data. Say, keywords on the first line, and everything else the program treats as a big blob.
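As a minimal sketch of that one-file-per-entry idea, assuming the keywords on the first line are separated by whitespace (the thread does not say how) and that everything after the first line is the free-form blob; the function names are illustrative:

```python
def parse_entry(text):
    """Split one entry: keywords on the first line, the rest kept as a blob."""
    first_line, _, body = text.partition("\n")
    return {"keywords": set(first_line.split()), "body": body.strip()}

def load_entry(path):
    """Read one entry file and remember which file it came from."""
    with open(path) as f:
        entry = parse_entry(f.read())
    entry["path"] = path
    return entry
```

The description can then span as many lines as you like, since nothing after the first newline is parsed.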
Tuesday, September 25th, 2007 08:37 pm (UTC)
As someone said above, you could easily use a spreadsheet saving files in CSV format for your data entry.

I love programming in Python. I find that most of the time the way it works is the way I think about code in my head. It has worked very well for a lot of small projects I've done at work or at home.
Saturday, September 29th, 2007 01:38 am (UTC)
this reminds me that I just saw an article today about spreadsheets for programmers, which use python instead of VB as their scripting language.
Saturday, September 15th, 2007 07:33 pm (UTC)
You know, I'm sure I'll find out what this is for at some point, but I really don't want to know. This is what happens when I leave you alone for a week. :P
Saturday, September 15th, 2007 07:37 pm (UTC)
I haven't spent any substantial time on it, honest! :) And you'll note that unlike previous crazy plans, I explicitly came into this one NOT wanting to spend any substantial time setting it up. If I can't find a quick and easy way to do it, it's not going to happen.
Saturday, September 15th, 2007 07:50 pm (UTC)
I have a program called CocoaMySQL, which is basically a gui version of MySQL. It also shows you what it's doing, so I used it as a tool to learn MySQL commands and syntax and test/debug problematic commands that were outputted by my php-based japanese-flashcards-studying "program" (if you can call a webpage that at all, ever).


On the other hand, have you ever considered...

...

...

Excel? ;)