If you’re using the Tulin et al. data provided in the snapshot above, you should see a bunch of files like:

If it looks like it contains the right commands, you can run it by doing:
Nanopore sequencing is still rapidly maturing, and we believe that advancements in sequencing chemistries, nanopore design and analysis algorithms will vastly improve the technology and address the shortcomings of low read numbers and high error rates in the near future. Lower error rates will, for example, allow us to improve the pipeline further by enabling base-accurate identification of TSS/TES and splice sites, instead of identifying 20 bp bins for these features. Even with its current limitations,
Reads were searched against UniRef100 (accessed 20151020) using DIAMOND v0.7.12 with the BLASTX option. The top hit of each read (if above 1e-3) was mapped to KEGG Orthology (KO) IDs using the UniProt ID mapping files. Reads mapped to each KO were summed to produce a count table. Correlations and significance tests were performed with R after applying and a cut-off > 500.
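The read-to-KO counting step can be sketched in Python as follows. This is a minimal illustration, not the pipeline used here. It assumes DIAMOND's 12-column BLAST-style tabular output, a three-column UniProt `idmapping.dat`-style file (accession, ID type, ID) with `KO` as the ID type, and subject IDs of the form `UniRef100_<accession>`. The function names are hypothetical, and the threshold is interpreted as a maximum e-value of 1e-3, which is itself an assumption.

```python
from collections import Counter

UNIREF_PREFIX = "UniRef100_"  # assumed subject ID prefix


def top_hits(diamond_lines, max_evalue=1e-3):
    """Return {read_id: subject_id}, keeping the highest-bitscore hit per read.

    Assumes 12 tab-separated columns with the e-value in column 11 and the
    bitscore in column 12. Hits with e-value above max_evalue are dropped.
    """
    best = {}
    for line in diamond_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # skip malformed or empty lines
        read_id, subject_id = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue > max_evalue:
            continue
        if read_id not in best or bitscore > best[read_id][1]:
            best[read_id] = (subject_id, bitscore)
    return {read: subject for read, (subject, _) in best.items()}


def load_ko_map(idmapping_lines):
    """Return {accession: KO} from three-column ID mapping lines."""
    ko_map = {}
    for line in idmapping_lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) == 3 and parts[1] == "KO":
            # Keep the first KO seen for an accession (a simplification).
            ko_map.setdefault(parts[0], parts[2])
    return ko_map


def ko_counts(hits, ko_map):
    """Sum the number of reads whose top hit maps to each KO."""
    counts = Counter()
    for subject_id in hits.values():
        accession = subject_id
        if accession.startswith(UNIREF_PREFIX):
            accession = accession[len(UNIREF_PREFIX):]
        ko = ko_map.get(accession)
        if ko is not None:
            counts[ko] += 1
    return counts
```

Reads whose top hit has no KO annotation are simply left out of the count table in this sketch. Each function takes an iterable of lines, so open file handles can be passed directly.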
Thanks a lot for the answer! I have since spoken to some colleagues who also recommended cutadapt. Looks like that is the better option.