Hi,
I'm fiddling with some little project where I need to count the number of bits that are the same between two bitstrings. My first approximation would be to bitXnor: both integers and count bits in the result. However, there's no bitXnor: and the standard implementation: a XNOR b = (a AND b) OR (NOT a AND NOT b) won't fly because there's not bitNot in Integer, as far as I can tell (negate draws in all the ugliness of 2-complement arithmetic :)). It's probably a stupid question that only shows how rusty my bit fiddling is - but what would be a *good* performing implementation of something that counts matching bits? (for extra credits: if the two integers are unequal in size, it's a pattern match so the smaller int needs to be shifted along the larger int and the largest count is the one that's returned): 011101 / 101 -> 3 011111 / 101 -> 2 0101 / 0001 -> 3 ... etcetera) -- "Human beings make life so interesting. Do you know, that in a universe so full of wonders, they have managed to invent boredom. " - Death, in "The Hogfather" |
Cees de Groot wrote:
> Hi, > > I'm fiddling with some little project where I need to count the number > of bits that are the same between two bitstrings. My first > approximation would be to bitXnor: both integers and count bits in the > result. However, there's no bitXnor: and the standard implementation: > > a XNOR b = (a AND b) OR (NOT a AND NOT b) > > won't fly because there's not bitNot in Integer, as far as I can tell > (negate draws in all the ugliness of 2-complement arithmetic :)). There's bitInvert, I think (I have no Squeak image here). If you don't have it, bitInvert is just "-1 - self". > It's probably a stupid question that only shows how rusty my bit > fiddling is - but what would be a *good* performing implementation of > something that counts matching bits? You need to give the width, as (if both numbers have the same sign, at least) you can always add a zero to the left and get one more matching bit. :-) So the method would be something like this: bitMatch: b width: n ^((self bitXor: b) bitInvert bitAnd: (1 bitShift: n) - 1) bitCount bitCount | n cnt | n := self. cnt := 0. [ n = 0 ] whileFalse: [ cnt := cnt + 1. n := n bitAnd: n - 1 ]. ^cnt > 0101 / 0001 -> 3 ^2r0101 bitMatch: 2r0001 width: 4 > integers are unequal in size, it's a pattern match so the smaller int > needs to be shifted along the larger int and the largest count is the > one that's returned): > > 011101 / 101 -> 3 > 011111 / 101 -> 2 I don't know the answer for this, but I know of an efficient algorithm for a similar problem, that is to *find* a bitstring in another. For this, you can adapt the Knuth-Morris-Pratt string searching algorithm. KMP works by computing the longest prefix corresponding to all the partial match (e.g. if you've matched six chars in 0101011 and fail on the seventh char, you can try again with a four-character partial match). You can modify the algorithm in two ways: 1) you can optimize the prefix table because you only have two values. In the previous case, if you failed to match the seventh char in 0101011, since you wanted a one you know that the target bitstring had a zero. So you can continue with a five-character partial match. 2) when you find a match, if you want overlapping matches you can use the prefix table to find them. For example, if you are looking for is 0101, the KMP prefix table (not the one modified according to optimization 1) says that after failing to match the fourth character you can try matching against the second. So, that's what you do even after finding a full match. Optimization 1 can also be applied to this table, though in a slightly different way. You can find code for KMP inside VisualWorks or GNU Smalltalk (see Stream>>#nextPutAll:). Paolo |
On Dec 17, 2007 1:34 PM, Paolo Bonzini <[hidden email]> wrote:
> There's bitInvert, I think (I have no Squeak image here). If you don't > have it, bitInvert is just "-1 - self". > Yup - and that draws in a sign and other ugliness. At least, in a direct implementation of bitXnor: bitXnor: anInteger " return bit-wise XNOR of two numbers. " ^ (self bitAnd: anInteger) bitOr: (self bitInvert bitAnd: anInteger bitInvert) 0 bitXnor: 0 -> -1 > You need to give the width, as (if both numbers have the same sign, at > least) you can always add a zero to the left and get one more matching > bit. :-) So the method would be something like this: > > bitMatch: b width: n > ^((self bitXor: b) bitInvert bitAnd: (1 bitShift: n) - 1) > bitCount > > bitCount > | n cnt | > n := self. cnt := 0. > [ n = 0 ] whileFalse: [ > cnt := cnt + 1. > n := n bitAnd: n - 1 ]. > ^cnt > > I don't know the answer for this, but I know of an efficient algorithm > for a similar problem, that is to *find* a bitstring in another. For > this, you can adapt the Knuth-Morris-Pratt string searching algorithm. > get the bitcounting efficient the couple of loop/shift operations needed to shift the shorter pattern along the larger one should be ok. Thanks for the suggested implementation - i'll throw it at my unit tests :-) (if you think - wth is that guy needing that stuff for - the bit strings are genomes, and i want to make some decisions in an artificial life algorithm based on 'fit' - how well the genomes match (and also on 'non-fit' - how far they are apart, but that is step 2). Just something to keep me busy in the train commute) -- "Human beings make life so interesting. Do you know, that in a universe so full of wonders, they have managed to invent boredom. " - Death, in "The Hogfather" |
On Dec 17, 2007, at 18:29 , Cees de Groot wrote:
> On Dec 17, 2007 1:34 PM, Paolo Bonzini <[hidden email]> wrote: >> There's bitInvert, I think (I have no Squeak image here). If you >> don't >> have it, bitInvert is just "-1 - self". >> > Yup - and that draws in a sign and other ugliness. At least, in a > direct implementation of bitXnor: > > bitXnor: anInteger > " return bit-wise XNOR of two numbers. " > > ^ (self bitAnd: anInteger) bitOr: (self bitInvert bitAnd: > anInteger bitInvert) > > 0 bitXnor: 0 -> -1 And what exactly is wrong with that? - Bert - |
On Dec 17, 2007 1:49 PM, Bert Freudenberg <[hidden email]> wrote:
> > 0 bitXnor: 0 -> -1 > > And what exactly is wrong with that? > err... I'd expect '1' here? The problem is that bitInvert is not identical, as far as I can tell, with bitNot... 1 bitInvert -> -2 (where '1 bitNot' should result in 0) And for the problem itself: drawing a quick list of truth tables of available bit ops quickly pointed out that counting similar bits with XNOR and counting different bits with XOR are equivalent ;-P. That's what you get for firing off posts at the end of your lunch break. Making the number of matching bits a similarity indicated based on the length of the largest bit string, I currently have (pending more unit tests ;-)): findSimilarityBetween: n1 and: n2 "check how similar n1 and n2 are. Similarity is defined of the maximum matching number of bits - if n1 << n2, we'll 'slide' n1 along n2'''" | small large match thisMatch thisMatchBitcount nBits | (n1 = n2) ifTrue: [^1.0]. (n1 < n2) ifTrue: [small := n1. large := n2] ifFalse: [small := n2. large := n1]. match := 0. nBits := large highBitOfMagnitude. 1 to: nBits do: [:i | thisMatch := small bitXor: large. thisMatchBitcount := nBits - (self countBits: thisMatch). (thisMatchBitcount > match) ifTrue: [match := thisMatchBitcount]. small := small << 1]. ^match / (nBits * 1.0) -- "Human beings make life so interesting. Do you know, that in a universe so full of wonders, they have managed to invent boredom. " - Death, in "The Hogfather" |
Cees de Groot wrote:
> On Dec 17, 2007 1:49 PM, Bert Freudenberg <[hidden email]> wrote: >>> 0 bitXnor: 0 -> -1 >> And what exactly is wrong with that? >> > err... I'd expect '1' here? No. If you have arbitrary-precision integers, numbers like "infinitely many ones plus something else" are negative; numbers like "infinitely many zeros plus something else" are positive. 0 is infinitely many zeros, -1 is infinitely many ones -- and XNORing each of the infinitely many zeros with itself produces infinitely many ones, right? The problem is that your bitstrings have a length beyond which all bits should be considered zeros; integers have not. I hope this is understandable... Paolo |
In reply to this post by Cees De Groot
>>>>> "Cees" == Cees de Groot <[hidden email]> writes:
Cees> Hi, Cees> I'm fiddling with some little project where I need to count the number Cees> of bits that are the same between two bitstrings. My first Cees> approximation would be to bitXnor: both integers and count bits in the Cees> result. However, there's no bitXnor: and the standard implementation: Cees> a XNOR b = (a AND b) OR (NOT a AND NOT b) Cees> won't fly because there's not bitNot in Integer, as far as I can tell Cees> (negate draws in all the ugliness of 2-complement arithmetic :)). That looks like NOT (a XOR b) to me. Wouldn't that already have a fast implementation? -- Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095 <[hidden email]> <URL:http://www.stonehenge.com/merlyn/> Perl/Unix/security consulting, Technical writing, Comedy, etc. etc. See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training! |
In reply to this post by Paolo Bonzini-2
I *understand* it, but it's not what I *want* :).
(granted, "expect" is not the right word) On Dec 17, 2007 5:05 PM, Paolo Bonzini <[hidden email]> wrote: > Cees de Groot wrote: > > On Dec 17, 2007 1:49 PM, Bert Freudenberg <[hidden email]> wrote: > >>> 0 bitXnor: 0 -> -1 > >> And what exactly is wrong with that? > >> > > err... I'd expect '1' here? > > No. If you have arbitrary-precision integers, numbers like "infinitely > many ones plus something else" are negative; numbers like "infinitely > many zeros plus something else" are positive. 0 is infinitely many > zeros, -1 is infinitely many ones -- and XNORing each of the infinitely > many zeros with itself produces infinitely many ones, right? > > The problem is that your bitstrings have a length beyond which all bits > should be considered zeros; integers have not. > > I hope this is understandable... > > Paolo > > > -- "Human beings make life so interesting. Do you know, that in a universe so full of wonders, they have managed to invent boredom. " - Death, in "The Hogfather" |
Cees de Groot wrote:
> I *understand* it, but it's not what I *want* :). The point is that, there is no reason why 0 bitInvert should return 1 rather than 3, 7, or 15. Your data type is a BitArray, but if you encode it as an Integer, that's not enough. You need another instance variable to store the width or alternatively a mask (i.e. (1 bitShift: width) - 1). Then, you have something like this: (the bitOr:'s in the masks are not typos). & aBitString ^BitString value: (self value bitAnd: aBitString value) mask: (self mask bitOr: aBitString mask) | aBitString ^BitString value: (self value bitOr: aBitString value) mask: (self mask bitOr: aBitString mask) invert ^BitString value: (self value bitXor: self mask) mask: self mask xor: aBitString ^BitString value: (self value bitXor: aBitString value) mask: (self mask bitOr: aBitString mask) xnor: aBitString ^(self xor: aBitString) invert count | n cnt | n := self value. cnt := 0. [ n = 0 ] whileFalse: [ cnt := cnt + 1. n := n bitAnd: n - 1 ]. ^cnt Paolo |
Use a Bitmap and BitBLT. Dan put loads of stuff in there to mangle bits
tim -- tim Rowledge; [hidden email]; http://www.rowledge.org/tim Strange OpCodes: PWB: Put to Waste Basket |
In reply to this post by Cees De Groot
On Mon, Dec 17, 2007 at 01:00:41PM +0100, Cees de Groot wrote:
> It's probably a stupid question that only shows how rusty my bit > fiddling is - but what would be a *good* performing implementation of > something that counts matching bits? (for extra credits: if the two > integers are unequal in size, it's a pattern match so the smaller int > needs to be shifted along the larger int and the largest count is the > one that's returned): doing a google search for "fast bit counting" brings up http://infolab.stanford.edu/~manku/bitcount/bitcount.html as the first hit. The first algorithm listed takes time proportional to the number of 1 bits in the integer, rather than the length of the bit encoding. It relies on the ability to do a fast zero check, though, as the loop exit condition -- Matthew Fulmer -- http://mtfulmer.wordpress.com/ Help improve Squeak Documentation: http://wiki.squeak.org/squeak/808 |
In reply to this post by timrowledge
> Use a Bitmap and BitBLT. Dan put loads of stuff in there to mangle bits
Yes. There is a SqueakMap entry called BitArray that utilizes Bitmap and BitBlt, and among others, #cardinality returns the number of set bits. I haven't quite understood Cees' requirement, (is it effectively to find a substring or not?) but BitArray might help. -- Yoshiki |
In reply to this post by Cees De Groot
Cees de Groot a écrit :
> Hi, > > I'm fiddling with some little project where I need to count the number > of bits that are the same between two bitstrings. My first > approximation would be to bitXnor: both integers and count bits in the > result. However, there's no bitXnor: and the standard implementation: > > a XNOR b = (a AND b) OR (NOT a AND NOT b) > > won't fly because there's not bitNot in Integer, as far as I can tell > (negate draws in all the ugliness of 2-complement arithmetic :)). > > It's probably a stupid question that only shows how rusty my bit > fiddling is - but what would be a *good* performing implementation of > something that counts matching bits? (for extra credits: if the two > integers are unequal in size, it's a pattern match so the smaller int > needs to be shifted along the larger int and the largest count is the > one that's returned): > > 011101 / 101 -> 3 > 011111 / 101 -> 2 > 0101 / 0001 -> 3 > ... etcetera) > I don't understand how you could achieve that with Integer 0101 / 0001 -> 3 00101 / 00001 -> 4 000101 / 000001 -> 5 Same integer, different results... So yes, BitArray as suggested by Yoshiki is maybe your option Nicolas |
In reply to this post by Cees De Groot
On Dec 17, 2007, at 19:04 , Cees de Groot wrote:
> On Dec 17, 2007 1:49 PM, Bert Freudenberg <[hidden email]> > wrote: >>> 0 bitXnor: 0 -> -1 >> >> And what exactly is wrong with that? >> > err... I'd expect '1' here? How should that method know you are looking for 1-bit numbers only? In your logic 2r0 bitXnor: 2r0 -> 2r1 2r00 bitXnor: 2r00 -> 2r11 which would be rather weird, wouldn't it? -1 in two's complement is an endless string of 1s, so that's the Right Thing. If you want a "fixed-width" number then you can just #bitAnd: with the number of bits you like: -1 bitAnd: 2r1 -> 2r1 -1 bitAnd: 2r11 -> 2r11 - Bert - |
In reply to this post by Paolo Bonzini-2
On Dec 17, 2007 6:25 PM, Paolo Bonzini <[hidden email]> wrote:
> Your data type is a BitArray, but if you encode it as an Integer, that's > not enough. You need another instance variable to store the width or > alternatively a mask (i.e. (1 bitShift: width) - 1). Then, you have > something like this: > [...] thanks. You're right, of course. I'm trying to stuff square pegs in round holes here. That's probably what you get when your daytime job largely plays in PHP-land... Tim - thanks for the tip on Bitblt, I would have never thought to look there (graphics stuff scares me ;-)) -- "Human beings make life so interesting. Do you know, that in a universe so full of wonders, they have managed to invent boredom. " - Death, in "The Hogfather" |
In reply to this post by Tapple Gao
I typed 'counting bits' on Google and found the excellent page
http://graphics.stanford.edu/~seander/bithacks.html Nice to see that a trivial space/time trade-off works soo much better than funky algorithms, by the way :) On Dec 17, 2007 6:37 PM, Matthew Fulmer <[hidden email]> wrote: > doing a google search for "fast bit counting" brings up > http://infolab.stanford.edu/~manku/bitcount/bitcount.html as the > first hit. The first algorithm listed takes time proportional to > the number of 1 bits in the integer, rather than the length of > the bit encoding. It relies on the ability to do a fast zero > check, though, as the loop exit condition > > -- > Matthew Fulmer -- http://mtfulmer.wordpress.com/ > Help improve Squeak Documentation: http://wiki.squeak.org/squeak/808 > > -- "Human beings make life so interesting. Do you know, that in a universe so full of wonders, they have managed to invent boredom. " - Death, in "The Hogfather" |
Free forum by Nabble | Edit this page |