FREE Subscription to Dr. Dobb’s Digest: Same Great Content, New Digital Edition
Site Archive (Complete)
High Performance Computing
Email
Print
Reprint

add to:
Del.icio.us
Digg
Google
Furl
Slashdot
Y! MyWeb
Blink
March 13, 2008
Fast String Search on Multicore Processors

Mapping fundamental algorithms onto parallel hardware

(Page 1 of 4)
Daniele Paolo Scarpazza, Oreste Villa, and Fabrizio Petrini
Multicores are the future, and you need to start mapping basic algorithms onto this new hardware.
Daniele, Oreste, and Fabrizio were affiliated with the Pacific Northwest National Laboratory when this article was written. Daniele and Fabrizio have since moved to the Cell Solutions department of IBM. They can currently be contacted at scarpaz@scarpaz.com.


While your current PC may be dual- or quad-core, the one sitting on your desk in a couple of years will be a "many-core." Multicores are the future, and because programming them is so different than programming for traditional processors, you must prepare for the change now, especially in mapping basic algorithms efficiently onto the new hardware.

String searching is one of these basic algorithms. It has a host of applications, including search engines, network intrusion detection, virus scanners, spam filters, and DNA analysis, among others. The Cell processor, with its multiple cores, promises to speed-up string searching a lot.

In this article, we show how we mapped string searching efficiently on the Cell. We present two implementations:

  • The fast implementation supports a small dictionary size (approximately 100 patterns) and provides a throughput of 40 Gbps, which is 100 times faster than reference implementations on x86 architectures.
  • The heavy-duty implementation is slower (3.3-4.3 Gbps), but supports dictionaries with tens of thousands of strings.

This task is not trivial. We had to change our algorithm significantly to reach top performance. In particular, to exploit the memory subsystem at its best, we employ a pipelined parallelization strategy, and we shuffle the data layout to fight congestion—techniques that are unfamiliar to most programmers of traditional architectures.

Why Is String Searching So Important?

The Internet is a dirty place, with malware, spyware, spam, and viruses trying to penetrate your systems through all possible vulnerabilities. Undesired traffic cannot be filtered anymore on the basis of header information. What's needed is"deep-packet inspection, which checks the payload against a database containing signatures of a large number of threats.

But with network links getting faster (10 Gbps Ethernet, for instance), performing on-the-fly deep-packet inspection isn't child's play. For example, running the Snort (www.snort.org) intrusion-detection system on a Pentium 4 PC only ensures a filtering throughput between 200-400 Mbps. This is why network appliance manufacturers often employ specialized hardware like Field-Programmable Gate Arrays (FPGAs) or Application-Specific Instruction Processors (ASIPs). The Cell processor is a new player in this field, with the potential to offer lots of computing power at low price. In fact, its large production volumes (it's used in the Sony PlayStation 3) contribute to keeping its price low.

1 String Searching | 2 Programming the Cell: Rethinking Bottom-Up | 3 Why Aho-Corasick and the Cell Make a Good Match | 4 The Heavy-Duty Solution Next Page
RELATED ARTICLES
No Related Articles
TOP 5 ARTICLES
No Top Articles.
DR. DOBB'S CAREER CENTER
Looking for a new job? open | close
Search jobs on Dr. Dobb's TechCareers
Function:

Keyword(s):

State:  
  • Post Your Resume
  • Employers Area
  • News & Features
  • Blogs & Forums
  • Career Resources

    Browse By:
    Location | Employer | City
  • Most Recent Posts:



    MICROSITES
    FEATURED TOPIC

    ADDITIONAL TOPICS

    INFO-LINK



     




    Techweb
    Informationweek Business Technology Network
    InformationweekInformationweek 500Informationweek 500 ConferenceInformationweek AnalyticsInformationweek Events
    Informationweek MagazineGlobal CIOIWK Government ITbMightyByte and SwitchDark Reading
    Digital LibraryIntelligent EnterpriseInternet EvolutionNetwork ComputingPlug Into The CloudDr. DobbsContentinople
    space
    TechWeb Events Network
    InteropVoiceConWeb 2.0 ExpoWeb 2.0 SummitEnterprise 2.0Mobile Business ExpoNoJitter
    Black HatGTECEnergy CampCloud ConnectGov 2.0 ExpoGov 2.0 Summit
    space
    Light Reading Communications Network
    Light ReadingLight Reading AsiaUnstrungCable Digital NewsInternet EvolutionPyramid Research
    Heavy ReadingLight Reading LiveLight Reading InsiderEthrnet ExpoTelco TVTower Technology Summit
    space
    Financial Technology Network
    Advanced TradingBank Systems and TechnologyInsurance and TechnologyWall Street and TechnologyAccelerating WallstreetBST SummitBuyside Trading SummitIT Summit
    space
    Microsoft Technology Network
    MSDNTechNetTotal IT ProTotal Dev ProNET Total Dev Pro CommunitySQL Total Dev Pro Community
    space