Content provided by Scale Cast – A podcast about big data, distributed systems, and scalability. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by Scale Cast – A podcast about big data, distributed systems, and scalability or its podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process described here: https://ru.player.fm/legal.

More Optimal Bloom Filters

 
The Bloom filter, conceived by Burton H. Bloom in 1970, is a
space-efficient probabilistic data structure that is used to test
whether an element is a member of a set. False positives are possible,
but false negatives are not. Elements can be added to the set, but not
removed (though this can be addressed with a counting filter). The
more elements that are added to the set, the larger the probability of
false positives.
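
The behaviour described above can be illustrated with a minimal sketch. This is a textbook-style Bloom filter, not the improved constructions presented in the talk; the class names, the parameters m (number of bits) and k (number of hash positions), and the SHA-256 double-hashing scheme are all assumptions chosen for illustration.

```python
import hashlib


class BloomFilter:
    """Illustrative Bloom filter with m bits and k hash positions per item."""

    def __init__(self, m: int, k: int) -> None:
        self.m = m
        self.k = k
        self.bits = bytearray((m + 7) // 8)  # m bits, packed 8 per byte

    def _positions(self, item: str) -> list[int]:
        # Assumption of this sketch: derive k positions from two 64-bit
        # halves of one SHA-256 digest (double hashing).
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        # True may be a false positive; False is always correct.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


class CountingBloomFilter(BloomFilter):
    """Variant that keeps a counter per slot so that items can be removed."""

    def __init__(self, m: int, k: int) -> None:
        super().__init__(m, k)
        self.counts = [0] * m

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.counts[pos] += 1

    def remove(self, item: str) -> None:
        # Only safe for items that were actually added; removing anything
        # else can introduce false negatives.
        for pos in self._positions(item):
            self.counts[pos] -= 1

    def __contains__(self, item: str) -> bool:
        return all(self.counts[pos] > 0 for pos in self._positions(item))
```

Every item that was added is reported as present, which is the no-false-negatives guarantee. The counting variant trades extra space per slot for the ability to remove items.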

For example, one might use a Bloom filter to do spell-checking in a
space-efficient way. A Bloom filter to which a dictionary of correct
words has been added will accept all words in the dictionary and
reject almost all words which are not, which is good enough in some
cases. Depending on the false positive rate, the resulting data
structure can require as little as a byte per dictionary word.
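
To make the "byte per dictionary word" figure concrete, the sketch below uses the standard approximation for a classic Bloom filter with an optimally chosen number of hash functions, assuming independent and uniform hashes. The function names are hypothetical, and the formulas describe the classic structure rather than the improvements covered in the talk.

```python
import math


def bits_per_element(p: float) -> float:
    """Bits needed per stored element for a target false-positive rate p."""
    return -math.log(p) / (math.log(2) ** 2)


def optimal_hash_count(bits_per_elem: float) -> float:
    """Real-valued number of hash functions that minimizes the rate."""
    return bits_per_elem * math.log(2)


def false_positive_rate(bits_per_elem: float) -> float:
    """Approximate false-positive rate when the optimal hash count is used."""
    return 0.5 ** optimal_hash_count(bits_per_elem)


if __name__ == "__main__":
    # One byte (8 bits) per dictionary word.
    print(f"hash functions: {optimal_hash_count(8):.2f}")
    print(f"false-positive rate: {false_positive_rate(8):.4f}")
```

Under these assumptions, 8 bits per word corresponds to a false-positive rate of roughly 2%, so about one misspelled word in fifty would slip through such a spell-checker.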

In the last few years Bloom filters have become a hot topic again,
and several modifications and improvements have appeared. In this talk
I will present my latest improvements on this topic.

Speaker: Ely Porat
Ely Porat received his Doctorate from Bar-Ilan University in 2000.
Following that, he fulfilled his military service and, in parallel,
worked as a faculty member at Bar-Ilan University. Having spent the
spring 2007 semester as a Visiting Scientist at Google, he is now back
at Bar-Ilan University.

The main body of Ely Porat’s work concerns matching problems: string
matching, pattern matching, and subset matching. He has also worked on
the nearest pair problem in high-dimensional spaces, as well as on
sketching and edit distance.

