Talk:Judy array

Latest comment: 2 years ago by QuantumWiz in topic Drawbacks

Independent analysis

edit

While this article is the first hit for Judy array on Google, I do not think it is very informative. Unfortunately the official documentation reads very much like advertisement and keeps telling how great Judy arrays are instead of telling what they are. I have found two analyses by third-parties though, and they provide some insight into how Judy arrays compare to better-known data structures. I do not think I could incorporate them into the article, because I did not really do my homework, but here they are anyway: [1], [2] --CyHawk (talk) 11:51, 18 August 2008 (UTC)Reply

Less Vulnerable Than What?

edit

The article says "Judy arrays are also believed to be less vulnerable to algorithmic complexity attacks". Less vulnerable than what? 71.112.25.123 (talk) 19:34, 28 October 2009 (UTC)ATBSReply

Since the line has not been explained and isn't informative, i am removing it. B.suhasini (talk) 14:50, 18 October 2011 (UTC)Reply

Less vulnerable than, for example, common implementations of hash tables, which can easily be coaxed into their quadratic worst-case performance. http://www.cs.rice.edu/~scrosby/hash/CrosbyWallach_UsenixSec2003/ William Cushing — Preceding unsigned comment added by 128.32.45.10 (talk) 20:51, 20 October 2013 (UTC)Reply

Description of the Algorithm?

edit

There is no description of the data structure or of its algorithms, neither in pseudo-code nor in any programming language. Could someone familiar with it post them, a simple complexity analysis(like those in most wiki algorithm pages) would be good to. 178.166.8.164 (talk) 02:57, 5 January 2012 (UTC)Reply

Please see the implementation notes on my Judy Array project page at: http://code.google.com/p/judyarray karl m {{User electrical engineer}} (talk) 23:51, 11 January 2013 (UTC)Reply

Patent Restrictions

edit

I agree that wikipedia is not the place to issue opinions about the applicability of patents to specific situations. However, the HP patent is a very detailed description of some methods that are not particularly important to a general implemention of judy arrays -- like the myriad of methods to compress radix nodes for example. Additionally, when HP released the 25,000 lines of source code of their implementation to Source Forge, they did so with a general release of rights to subsequent usage.

With this in mind, I've made an edit to the main page to mitigate the impact of bringing up the patent question. Yes, there is a patent, but no, it is limited to its coverage of a novel HP implementation. Hope this helps,

karl m {{User electrical engineer}} (talk) 23:47, 11 January 2013 (UTC)Reply

The HP patent is listed as a drawback of Judy arrays, but the patent - as you say - is limited to the HP specifics. Since the patent bears no relation on Judy arrays generally I don't think it's relevant here and suggest removal. 15.203.169.125 (talk) 09:14, 27 August 2013 (UTC)Reply

There is no "non-HP" Judy array - even if the name is used as such. A "Judy" array is a mix of several techniques to achieve a desired space and time complexity. It's not a "basic" data structure. I would not remove the notice, as ppl who look up things on Wikipedia are likely to want to know it. Not that software patents are that applicable outside the US. 2.247.67.8 (talk) 04:06, 6 December 2015 (UTC)Reply

Judy arrays cannot replace on-disk B-trees

edit

B-trees are chiefly used when storing data on block devices (e.g. for database indices), which is e.g. why the proxy operation for analyzing time complexity is a disk seek (corresponding to a node access). Judy arrays cannot possibly compete with B-trees in this use case, so the lede is misleading. — Kallikanzaridtalk 09:45, 9 October 2013 (UTC)Reply

Large data sets efficiency

edit

The claim that Judy arrays are more efficient with large data sets compared to rival data structures is very, very dubious. Advanced data structures like Cuckoo hashing hash tables have very good performance even with load factors exceeding 90% - a far cry from hash tables usually discussed in comparison with Judy arrays, so a source for their time and space performance comparison is needed. Also, the notion of performance should be made very precise, because benchmarks show that insertions and deletions from Judy arrays are on average slower than the corresponding operations even in primitive separate chaining hash tables, and proponents of Judy arrays often admit that they really only care about lookup time and space overhead. — Kallikanzaridtalk 10:03, 9 October 2013 (UTC)Reply

Terminology

edit

Is there any point in having Terminology section in the article? It defines 3 terms that are supposedly needed to describe Judy arrays, yet they are not actually used anywhere in the article outside this very section (except for population which doesn't really deviate much from common understanding of that word and can be easily replaced with number of keys/elements). In its current form this section seems to be unrelated to the main topic of the article at all. — Preceding unsigned comment added by MwGamera (talkcontribs) 22:07, 29 December 2013 (UTC)Reply

You can removed it if you want. I just reworded it anyway since it was a close paraphrase copyright violation. Jason Quinn (talk) 21:35, 20 January 2015 (UTC)Reply

I removed it.Vluczkow (talk) 14:35, 29 June 2015 (UTC)Reply

Needs better explanations

edit

So, after reading ALL this, I'm still left with my first basic question: What IS a Judy Array?

This article only explains the properties of that array, but fails to explain what it ACTUALLY IS, other than "it's a data structure" wich was quite obvious already from the title.

I want to know what makes this different from any other array or data structure (it's not enough with: "it's more memory efficient", WHY is it more memory efficient!?).

Basically, I want more WHY-answers & HOW-answers, these are only WHAT-answers.

THANK YOU ALL. --178.251.245.195 (talk) 07:46, 15 June 2019 (UTC)Reply

Drawbacks

edit

I'm suspicious of a couple of the mentioned drawbacks.

Firstly: "Judy arrays are optimized for machines with 64 byte cache lines, making them essentially unportable without a significant rewrite.". That claim sounds way too strong. The code should still work on a platform using a different cache line size though it might not be as fast as it could be if was optimized for that size. Also, at least it's claimed that it should still make a good use of smaller cache lines (32 bytes) because it's not leaving a part of a cache line unutilized. If the cache line is larger (128 bytes) it might be slower than necessary.

Secondly: "In most applications the possible performance advantage is too small to justify the high complexity of the data structure implementation.". This shouldn't necessarily be a problem. A developer who can just use an existing implementation doesn't need to care about the complexity. The complexity should only be a problem if you're implementing a Judy array. QuantumWiz (talk) 13:24, 19 May 2022 (UTC)Reply