SWAPHI-LS is the first parallel Smith-Waterman algorithm exploiting Intel Xeon Phi clusters to accelerate the alignment of long DNA sequences. The algorithm is written in C++ (with a set of SIMD intrinsic functions), OpenMP and MPI. Our performance evaluation revealed that the algorithm runs stably and yields up to 30.1 GCUPS on a single Xeon Phi and up to 111.4 GCUPS on four Xeon Phis sharing a host (compiled with Intel C++ compiler version 14.0.1 and OpenMPI version 1.6.5). In addition, we have developed its sister program SWAPHI, which targets very large-scale protein sequence database search and supports multiple Xeon Phis sharing a host.



Other Related Papers


Two executable binaries will be generated after compiling and linking: swaphi-ls and mpi-swaphi-ls. The program swaphi-ls does not rely on an MPI library and targets a single Xeon Phi. The program mpi-swaphi-ls must be compiled with an MPI library and is designed for Xeon Phi clusters.


Scoring Scheme



Installation and Usage


  1. Intel C/C++ compiler or any other C/C++ compiler that supports Xeon Phi coprocessors.
  2. A C/C++ MPI library (e.g. OpenMPI, MPICH, Intel MPI) compiled with the aforementioned C/C++ compiler.


Before compiling, please modify the corresponding Makefile so that it points to the correct compilers and libraries.

  1. To compile the non-MPI-based version, please type command "make -f Makefile.phi".
  2. To compile the MPI-based version, please type command "make -f Makefile.mphi".
  3. To compile both versions, please type command "make".

Typical Usage

Non-MPI-based program swaphi-ls

  1. export KMP_AFFINITY=balanced; swaphi-ls -i seq1.fa -j seq2.fa
  2. export KMP_AFFINITY=balanced; swaphi-ls -i seq1.fa.gz -j seq2.fa.gz -x 2

MPI-based program mpi-swaphi-ls

  1. export KMP_AFFINITY=balanced; mpirun -np 4 mpi-swaphi-ls -i seq1.fa -j seq2.fa
  2. export KMP_AFFINITY=balanced; mpirun -hostfile host.file -np 4 mpi-swaphi-ls -i seq1.fa.gz -j seq2.fa.gz

Configure hostfile for MPI-based program

When running on a Xeon Phi cluster, make sure that the number of MPI processes running on each node does not exceed the number of Xeon Phis available on that node. This constraint can be enforced using a host file.
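As an illustration, assuming OpenMPI and two nodes, each equipped with two Xeon Phis (the hostnames node01 and node02 are placeholders), a host file that caps each node at two MPI processes might look like:

```
# host.file: at most two MPI processes (one per Xeon Phi) per node
node01 slots=2
node02 slots=2
```

Passing this file via "mpirun -hostfile host.file -np 4 mpi-swaphi-ls ..." lets OpenMPI distribute the four processes across the two nodes without oversubscribing the coprocessors; the "slots" keyword is OpenMPI's syntax, and other MPI implementations (e.g. MPICH, Intel MPI) use a slightly different host file format.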

Change Log


For any questions or suggestions for improvement, please contact Yongchao Liu (Email: yliu860 (at) gatech (dot) edu).