ABRomics is an online community-driven platform to scale up and improve surveillance and research on antibiotic resistance from a One Health perspective.


Genomic Workflow

The ABRomics genomic workflow, powered by Galaxy France and launched from the ABRomics platform, is designed to process and analyze bacterial genomic data through a systematic approach. It is divided into four main steps, each ensuring robust and reliable results.

Quality and Contamination Control

This initial step checks that the raw paired-end Illumina reads are of high quality and screens them for contamination.

Key Steps:

  1. Quality control and trimming
    • fastp (Chen et al., 2018) for quality control and trimming
  2. Taxonomic assignment on trimmed data
    • Kraken2 (Wood et al., 2019) for taxonomic assignment
    • Bracken (Lu et al., 2017) to re-estimate abundance at the species level
    • Recentrifuge (Martí, 2019) to generate a Krona chart
  3. Aggregating outputs into a single JSON file
    • ToolDistillator (ABRomics consortium, 2023) to extract and aggregate information from the different tool outputs into parsable JSON files


Outputs:

  1. Quality control:
    • Quality report
    • Trimmed reads
  2. Taxonomic assignment:
    • Tabular report of identified species
    • Tabular file assigning each read to a taxonomic level
    • Krona chart illustrating the species diversity of the sample
  3. Aggregating outputs:
    • JSON file with information about the outputs of fastp, Kraken2, Bracken, and Recentrifuge


Tools and parameters:

  • Kraken2 2.1.3
    – Parameters: Default
    – Database: PlusPF-16 (version 2022-06-07)
  • Bracken
    – Parameters: Taxonomic level: Species
    – Database: PlusPF-16 (version 2022-06-07)

Output files:

  • fastp
    – Default: report.json
    – Optional: trimmed_R1.fastq, trimmed_R2.fastq, report.html
  • Kraken2
    – Default: taxonomy_assignation.tsv
    – Optional: reads_assignation.txt
  • Bracken
    – Default: output.tsv
    – Optional: kraken_reestimated_report.tsv, prior of read for estimation (default 0), read length, taxonomic level
  • Recentrifuge
    – Default: data.tsv
    – Optional: report.html, stat.tsv
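
To illustrate how the species-level abundance table could support contamination control, here is a minimal Python sketch. It assumes a table with the usual Bracken output columns (`name`, `fraction_total_reads`). The function name `dominant_species`, the 0.9 cutoff, and the example table are hypothetical and are not ABRomics settings or results.

```python
import csv
import io


def dominant_species(bracken_tsv, min_fraction=0.9):
    """Return (species, fraction, passes) for the most abundant species.

    min_fraction is a hypothetical cutoff chosen for illustration only.
    """
    rows = list(csv.DictReader(io.StringIO(bracken_tsv), delimiter="\t"))
    if not rows:
        raise ValueError("no species rows found")
    # Pick the species holding the largest share of the reads.
    top = max(rows, key=lambda row: float(row["fraction_total_reads"]))
    fraction = float(top["fraction_total_reads"])
    return top["name"], fraction, fraction >= min_fraction


# Made-up example table, for illustration only.
EXAMPLE = (
    "name\ttaxonomy_id\ttaxonomy_lvl\tkraken_assigned_reads\t"
    "added_reads\tnew_est_reads\tfraction_total_reads\n"
    "Escherichia coli\t562\tS\t9000\t500\t9500\t0.95\n"
    "Klebsiella pneumoniae\t573\tS\t400\t100\t500\t0.05\n"
)

if __name__ == "__main__":
    print(dominant_species(EXAMPLE))
```

A sample whose dominant species falls below the chosen cutoff would then be a candidate for closer inspection, for instance with the Krona chart.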

Genome Assembly

Once the data is cleaned, it is assembled into contigs to form a coherent genomic sequence.

Key Steps:

  1. Assembly of reads into a final contig FASTA file
    • Shovill (Seemann, 2016)
  2. Quality control of the assembly
    • Quast (Gurevich et al., 2013)
    • Bandage (Wick et al., 2015) to plot the assembly graph
    • Refseqmasher (Ondov et al., 2016) to identify the closest reference genome
  3. Aggregating outputs into a single JSON file
    • ToolDistillator (ABRomics consortium, 2023) to extract and aggregate information from the different tool outputs into parsable JSON files


Outputs:

  1. Assembly:
    • Assembled contigs in FASTA format
    • Reads mapped to the assembly in BAM format
    • Assembly graph in GFA format
  2. Quality of the assembly:
    • Assembly report
    • Assembly graph plot
    • Tabular report of the closest reference genomes
  3. Aggregating outputs:
    • JSON file with information about the outputs of Shovill, Quast, Bandage, and Refseqmasher


Tools and parameters:

  • Refseqmasher
    – Parameters: Top N matches to report: 3

Output files:

  • Shovill
    – Default: contigs.fasta
    – Optional: alignment.bam, contigs_graph.gfa
  • Quast
    – Default: output.tsv
    – Optional: report.html
  • Bandage
    – Default: report_info.txt
    – Optional: plot.svg
  • Refseqmasher
    – Default: results.txt
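
One of the contiguity metrics commonly used in assembly quality reports is N50: the length of the shortest contig among the longest contigs that together cover at least half of the assembly. The Python sketch below shows how this value could be computed from a contig FASTA file; the helper names `read_contig_lengths` and `n50` are hypothetical, and the code is an illustration of the definition rather than the implementation used by any tool in the workflow.

```python
def read_contig_lengths(fasta_text):
    """Return the length of each sequence in FASTA-formatted text."""
    lengths = []
    current = None  # length of the record being read, None before the first header
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths):
    """Return the N50 of a list of contig lengths (0 for an empty list)."""
    total = sum(lengths)
    running = 0
    # Walk contigs from longest to shortest until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0


if __name__ == "__main__":
    example = ">contig1\nACGTACGTAC\n>contig2\nACGTAC\n>contig3\nAC\n"
    print(n50(read_contig_lengths(example)))
```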

Genome Annotation

This step annotates assembled genomes and identifies key genetic elements.

Key Steps:

  1. Genomic annotation
    • Bakta (Schwengers et al., 2021) to predict CDS and small proteins (sORF)
  2. Integron identification
    • IntegronFinder2 (Néron et al., 2022) to identify CALIN elements, In0 elements, and complete integrons
  3. Plasmid gene identification
    • PlasmidFinder (Carattoli and Hasman, 2020) to identify and type plasmid sequences
  4. Insertion sequence (IS) detection
    • ISEScan (Xie and Tang, 2017) to detect IS elements
  5. Aggregating outputs into a single JSON file
    • ToolDistillator (ABRomics consortium, 2023) to extract and aggregate information from the different tool outputs into parsable JSON files


Outputs:

  1. Genomic annotation:
    • Genome annotation in tabular, GFF, and several other formats
    • Annotation plot
    • Identified nucleotide and protein sequences
    • Summary of identified genomic elements
  2. Integron identification:
    • Integron identification in tabular format and a summary
  3. Plasmid gene identification:
    • Identified plasmid genes and associated BLAST hits
  4. Insertion sequence (IS) detection:
    • IS element list in tabular format
    • IS hits in FASTA format
    • ORF hits in protein and nucleotide FASTA format
    • IS annotation in GFF format
  5. Aggregating outputs:
    • JSON file with information about the outputs of Bakta, IntegronFinder2, PlasmidFinder, and ISEScan


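Tools and parameters:

  • Bakta 1.9.4
    – Parameters: Default (“Full” annotation)
    – Database: /
  • IntegronFinder2
    – Parameters: Thorough local detection: Yes; Search also for promoter and attI sites? Yes
  • PlasmidFinder 2.1.6
    – Parameters: Default
    – Database: commit 81c11f4 – 2023-12-04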

Output files:

  • Bakta
    – Default: output.json
    – Optional: protein.faa, nucleotide.fna, annotation.gff3, annotation.tsv, summary.txt, Genbank file, Embl file, contigs.fasta, hypothetical_protein.fasta, hypothetical_annotation.tsv, plot.svg
  • IntegronFinder2
    – Default: output.integrons
    – Optional: output.summary
  • PlasmidFinder
    – Default: output.json
    – Optional: genome_hits.fasta, plasmid_hits.fasta
  • ISEScan
    – Default: output.tsv
    – Optional: is.fna, orf.faa, orf.fna, annotation.gff3
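
The aggregation step that closes each part of the workflow can be pictured with a small Python sketch that gathers several tabular outputs into one JSON document. This is not ToolDistillator's code or output schema: the function names `tsv_to_records` and `aggregate`, the JSON layout, and the example column names are all hypothetical and only illustrate the general idea.

```python
import csv
import io
import json


def tsv_to_records(tsv_text):
    """Parse tab-separated text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))


def aggregate(tool_tables):
    """Merge per-tool tabular outputs into one JSON string.

    tool_tables maps a tool name to the text of its tabular output.
    The resulting layout is a hypothetical one, chosen for illustration.
    """
    summary = {}
    for tool, text in tool_tables.items():
        records = tsv_to_records(text)
        summary[tool] = {"n_records": len(records), "records": records}
    return json.dumps(summary, indent=2, sort_keys=True)


if __name__ == "__main__":
    # Made-up tables with made-up column names.
    tables = {
        "isescan": "element\tcontig\nIS_1\tcontig_1\n",
        "integronfinder2": "element\tcontig\nintegron_1\tcontig_2\n",
    }
    print(aggregate(tables))
```

Keeping one top-level key per tool makes it straightforward for downstream code to pick out a single tool's results without parsing the others.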

AMR Gene Detection

Performed in parallel with annotation, this step focuses on detecting antimicrobial resistance (AMR) genes.

Key Steps:

  1. Genomic detection
    • Antimicrobial resistance gene identification:
      • StarAMR (Bharat et al., 2022) to run BLAST searches against the ResFinder (Zankari et al., 2012) and PlasmidFinder (Carattoli et al., 2014) databases
      • AMRFinderPlus (Feldgarden et al., 2021) to find antimicrobial resistance genes and point mutations
    • Virulence gene identification:
      • ABRicate (Seemann, 2016) with the VFDB_A database
  2. Aggregating outputs into a single JSON file
    • ToolDistillator (ABRomics consortium, 2023) to extract and aggregate information from the different tool outputs into parsable JSON files


Outputs:

  1. Genomic detection:
    • Antimicrobial resistance gene identification:
      • AMR gene list
      • MLST typing
      • Plasmid gene identification
      • BLAST hits
      • AMR gene sequences in FASTA format (assembled nucleotide sequences)
      • Point mutation list
    • Virulence gene identification:
      • Gene identification in tabular format
  2. Aggregating outputs:
    • JSON file with information about the outputs of StarAMR, AMRFinderPlus, and ABRicate


Tools and parameters:

  • StarAMR
    – Parameters: Percent identity threshold for BLAST: 90.0
    – Databases:
      • ResFinder: 2.4.0 – commit e0525f2 – 2024-09-23
      • PointFinder: 4.1.1 – commit 694919f – 2024-08-08
      • PlasmidFinder: commit 4add282 – 2024-11-14
      • MLST version: 2.23.0
  • AMRFinderPlus 3.12.8
    – Parameters: Default
    – Database: V3.12 – 2024-05-02
  • ABRicate 1.0.1
    – Parameters: Default (Minimum DNA %identity and %coverage: 80.0)
    – Database: VFDB

Output files:

  • StarAMR
    – Default: resfinder.tsv
    – Optional: mlst.tsv, pointfinder.tsv, plasmidfinder.tsv, settings.tsv
  • AMRFinderPlus
    – Default: report.tsv
    – Optional: point_mutation_report.tsv, nucleotide_sequence.fasta
  • ABRicate
    – Default: report.tsv
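
To make the effect of the minimum %identity and %coverage thresholds concrete, the Python sketch below filters a tabular gene report using the same 80.0 values. It assumes a table with `%IDENTITY` and `%COVERAGE` columns as in ABRicate reports; the function name `filter_hits`, the reduced column set, and the example rows are hypothetical. ABRicate applies its thresholds itself, so this only illustrates what such a filter does.

```python
import csv
import io


def filter_hits(report_tsv, min_identity=80.0, min_coverage=80.0):
    """Keep only hits meeting both the identity and coverage thresholds."""
    reader = csv.DictReader(io.StringIO(report_tsv), delimiter="\t")
    return [
        row
        for row in reader
        if float(row["%IDENTITY"]) >= min_identity
        and float(row["%COVERAGE"]) >= min_coverage
    ]


# Made-up report with a reduced set of columns, for illustration only.
EXAMPLE = (
    "#FILE\tSEQUENCE\tGENE\t%COVERAGE\t%IDENTITY\tDATABASE\n"
    "contigs.fasta\tcontig_1\tgeneA\t100.00\t99.50\tvfdb\n"
    "contigs.fasta\tcontig_2\tgeneB\t75.00\t95.00\tvfdb\n"
    "contigs.fasta\tcontig_3\tgeneC\t90.00\t79.90\tvfdb\n"
)

if __name__ == "__main__":
    for hit in filter_hits(EXAMPLE):
        print(hit["GENE"], hit["%IDENTITY"], hit["%COVERAGE"])
```

In this made-up table, geneB fails the coverage threshold and geneC fails the identity threshold, so only geneA is kept.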


The ABRomics workflow provides a comprehensive and integrated approach for bacterial genomic data analysis. From ensuring data quality to identifying critical genes, each step is optimized to deliver actionable and well-organized results.

Useful Links

  • Galaxy France platform: A web-based platform providing access to powerful, open-source tools for large-scale genomic and metagenomic data analysis.
  • Learning Pathway “Detection of AMR genes in bacterial genomes”: Part of the Galaxy Training Network, this pathway provides hands-on tutorials for researchers and students interested in detecting antimicrobial resistance (AMR) genes in bacterial genomes.


