Wikipedia:Reference desk/Archives/Computing/2019 March 18

Computing desk


March 18


How do you scan a dictionary in Python for a term based on its definition?


If I have a Python program whose goal is to accept a definition as input and retrieve the relevant dictionary term, how exactly do I go about doing this?

I mean, I have a Python dictionary (a very simple one) and I ask the user to write a definition that my program will accept as input. Then I need to look through the entire dictionary to see whether any of the definitions there match the definition I was given as input. If so, the program needs to print the dictionary term to which this definition applies. Futurist110 (talk) 22:24, 18 March 2019 (UTC)

You can either search the dictionary as you say (inefficient, since every lookup has to scan all the entries), or make a reverse dictionary that goes from definitions to words. Note that Python dictionaries are indexed by exact key: you can't look things up in them by substring or anything like that. You also have to decide what to do about definitions that are pointed to by more than one word. It's possible that you really want a database or search engine rather than a dictionary. Python has an SQLite module (sqlite3, in the standard library) and it's worth learning how to use it on general principles. It might also be useful for the specific problem you're asking about here. 173.228.123.166 (talk) 23:53, 18 March 2019 (UTC)
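For illustration, the reverse dictionary suggested above might look something like the following. This is only a minimal sketch: the sample `glossary` data and the helper names `build_reverse` and `lookup` are made up for the example, and the lower-casing and whitespace stripping are an assumption about what should count as "the same" definition. Definitions shared by several words are handled by mapping each definition to a list of terms.

```python
from collections import defaultdict

# Hypothetical sample data: term -> definition.
glossary = {
    "cat": "a small domesticated feline",
    "dog": "a domesticated canine",
    "kitty": "a small domesticated feline",
}


def normalise(text):
    """Assumed notion of equality: ignore case and surrounding whitespace."""
    return text.strip().lower()


def build_reverse(glossary):
    """Build a definition -> [terms] mapping from a term -> definition dict."""
    reverse = defaultdict(list)
    for term, definition in glossary.items():
        reverse[normalise(definition)].append(term)
    return dict(reverse)


def lookup(reverse, definition):
    """Return every term whose definition matches exactly after normalising."""
    return reverse.get(normalise(definition), [])


reverse = build_reverse(glossary)
print(lookup(reverse, "A small domesticated feline"))
```

Building the reverse mapping costs one pass over the dictionary, after which each lookup is a single hash lookup instead of a full scan. It still only finds exact matches (after normalising), so a user who words the definition differently gets an empty list.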
  • You have two problems. The one you haven't realised is the harder one.
Going from a definition to the term requires a match function to be defined, which compares your target definition with each stored definition. There's also a question (to which the answer is probably no) as to whether you can compare two definitions to decide which comes "first" (not just whether they are equal).
If there's only a match function, then you can't sort the definitions. That means your search algorithm is quite simple: you have to scan a linear list of definitions until you find a match. This is, unfortunately, slow. If they're sortable, then you can use a standard B-tree (or many other approaches) to make that part quicker.
The match function, though, is tricky. Just what is a "definition"? A paragraph of text? How do you compare two and judge them to be equal? Will your match be an exact text match over the whole paragraph? Or would you have to analyse all of them and "score" them, then find the best match? You could well be looking at algorithms and libraries like Lucene here. Potentially this is a "hard" problem (working at Google hard!). Andy Dingley (talk) 00:18, 19 March 2019 (UTC)
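As a rough sketch of the "score them all and pick the best" idea, here is a linear scan using `difflib.SequenceMatcher` from the Python standard library as the match function. This is a stand-in chosen for the example, not what Lucene or a real search engine does: it measures character-level similarity only, and the sample `glossary`, the helper names and the `threshold` of 0.6 are all assumptions.

```python
from difflib import SequenceMatcher

# Hypothetical sample data: term -> definition.
glossary = {
    "cat": "a small domesticated feline",
    "dog": "a domesticated canine",
    "python": "a large non-venomous snake",
}


def similarity(a, b):
    """Score two definitions between 0.0 (nothing alike) and 1.0 (identical)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_match(glossary, query, threshold=0.6):
    """Scan every definition and return (term, score) for the best one.

    Returns (None, best_score) when nothing reaches the assumed threshold.
    """
    best_term, best_score = None, 0.0
    for term, definition in glossary.items():
        score = similarity(query, definition)
        if score > best_score:
            best_term, best_score = term, score
    if best_score >= threshold:
        return best_term, best_score
    return None, best_score


term, score = best_match(glossary, "a domestic canine")
print(term, round(score, 2))
```

Every query still visits every definition, so this is the slow linear scan described above; the scoring only changes what counts as a match, not how many comparisons are made.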
Thank you very much for your responses, guys! Honestly, I think that the reverse dictionary approach is the best one for me for the time being. Futurist110 (talk) 04:27, 19 March 2019 (UTC)