Package: misha 5.11.10

Aviezer Lifshitz

misha: Toolkit for Analysis of Genomic Data

A toolkit for analysis of genomic data. The 'misha' package implements an efficient data structure for storing genomic data, and provides a set of functions for data extraction, manipulation and analysis. Some of the 2D genome algorithms were described in Yaffe and Tanay (2011) <doi:10.1038/ng.947>.

Authors: Misha Hoichman [aut], Aviezer Lifshitz [aut, cre], Eitan Yaffe [aut], Amos Tanay [aut], Weizmann Institute of Science [cph]

misha_5.11.10.tar.gz

misha_5.11.10.tgz (r-4.6-x86_64) | misha_5.11.10.tgz (r-4.6-arm64) | misha_5.11.10.tgz (r-4.5-x86_64) | misha_5.11.10.tgz (r-4.5-arm64)
misha_5.11.10.tar.gz (r-4.7-arm64) | misha_5.11.10.tar.gz (r-4.7-x86_64) | misha_5.11.10.tar.gz (r-4.6-arm64) | misha_5.11.10.tar.gz (r-4.6-x86_64)
manual.pdf | manual.html
DESCRIPTION | NEWS
card.svg | card.png
misha/json (API)

# Install 'misha' in R:
install.packages('misha', repos = c('https://tanaylab.r-universe.dev', 'https://cloud.r-project.org'))

Bug tracker: https://github.com/tanaylab/misha/issues

Pkgdown/docs site: https://tanaylab.github.io

Uses libs:
  • zlib – Compression library
  • c++ – GNU Standard C++ Library v3

Topics: genomic-data-analysis, zlib, cpp

7.84 score | 4 stars | 116 scripts | 280 downloads | 179 exports | 5 dependencies

Last updated from: 3ddd1e49af. Checks: 11 NOTE, 1 OK, 1 FAIL. Indexed: yes.

Target | Result | Time
linux-devel-arm64 | NOTE | 388
linux-devel-x86_64 | NOTE | 396
source / vignettes | OK | 658
linux-release-arm64 | NOTE | 388
linux-release-x86_64 | NOTE | 368
macos-release-arm64 | NOTE | 320
macos-release-x86_64 | NOTE | 785
macos-oldrel-arm64 | NOTE | 313
macos-oldrel-x86_64 | NOTE | 634
windows-devel | NOTE | 83
windows-release | NOTE | 87
windows-oldrel | NOTE | 81
wasm-release | FAIL | 161

Exports: .misha, %>%, gbins.quantiles, gbins.summary, gcis_decay, gcluster.run, gcompute_strands_autocorr, gcor, gdataset.example_path, gdataset.info, gdataset.load, gdataset.ls, gdataset.save, gdataset.unload, gdb.build_genome, gdb.convert_to_indexed, gdb.create, gdb.create_genome, gdb.create_linked, gdb.export_fasta, gdb.genome_info, gdb.get_readonly_attrs, gdb.info, gdb.init, gdb.init_examples, gdb.install_gff3_converter, gdb.install_gtf_converter, gdb.install_intervals, gdb.list_genomes, gdb.mark_cache_dirty, gdb.reload, gdb.set_readonly_attrs, gdb.unload, gdir.cd, gdir.create, gdir.cwd, gdir.rm, gdist, gextract, ggenome.implant, ggenome.transplant, gintervals, gintervals.2d, gintervals.2d.all, gintervals.2d.band_intersect, gintervals.2d.convert_to_indexed, gintervals.2d.intersect, gintervals.2d.union, gintervals.all, gintervals.annotate, gintervals.as_chain, gintervals.attr.export, gintervals.attr.get, gintervals.attr.import, gintervals.attr.set, gintervals.canonic, gintervals.chrom_sizes, gintervals.convert_to_indexed, gintervals.coverage_fraction, gintervals.covered_bp, gintervals.dataset, gintervals.dbs, gintervals.diff, gintervals.exists, gintervals.force_range, gintervals.from_mat, gintervals.from_strings, gintervals.import_bed, gintervals.import_genes, gintervals.import_gff, gintervals.import_vcf, gintervals.intersect, gintervals.is.bigset, gintervals.liftover, gintervals.load, gintervals.load_chain, gintervals.ls, gintervals.mapply, gintervals.mark_overlaps, gintervals.neighbors, gintervals.neighbors.directional, gintervals.neighbors.downstream, gintervals.neighbors.upstream, gintervals.normalize, gintervals.path, gintervals.quantiles, gintervals.random, gintervals.rbind, gintervals.rm, gintervals.save, gintervals.summary, gintervals.to_mat, gintervals.union, gintervals.update, giterator.cartesian_grid, giterator.intervals, glookup, gpartition, gquantiles, grevcomp, gsample, gscreen, gsegment, gseq.comp, gseq.extract, gseq.kmer, gseq.kmer.dist, gseq.pwm, gseq.pwm_edits, gseq.read_homer, gseq.read_jaspar, gseq.read_meme, gseq.rev, gseq.revcomp, gsetroot, gsummary, gsynth.bin_map, gsynth.cell_merge, gsynth.convert, gsynth.forbid_kmer, gsynth.load, gsynth.random, gsynth.replace_kmer, gsynth.sample, gsynth.save, gsynth.score, gsynth.train, gtrack.2d.convert_to_indexed, gtrack.2d.create, gtrack.2d.import, gtrack.2d.import_contacts, gtrack.array.extract, gtrack.array.get_colnames, gtrack.array.import, gtrack.array.set_colnames, gtrack.attr.export, gtrack.attr.get, gtrack.attr.import, gtrack.attr.set, gtrack.convert, gtrack.convert_to_indexed, gtrack.copy, gtrack.create, gtrack.create_dense, gtrack.create_dirs, gtrack.create_pwm_energy, gtrack.create_sparse, gtrack.dataset, gtrack.dbs, gtrack.exists, gtrack.export_bedgraph, gtrack.export_bigwig, gtrack.import, gtrack.import_mappedseq, gtrack.import_set, gtrack.info, gtrack.liftover, gtrack.lookup, gtrack.ls, gtrack.modify, gtrack.mv, gtrack.path, gtrack.rm, gtrack.smooth, gtrack.var.get, gtrack.var.ls, gtrack.var.rm, gtrack.var.set, gvtrack.array.slice, gvtrack.clear, gvtrack.create, gvtrack.filter, gvtrack.info, gvtrack.iterator, gvtrack.iterator.2d, gvtrack.ls, gvtrack.rm, gwget, gwilcox

Dependencies: curl, digest, magrittr, ps, yaml

Genomes
Create a misha database from UCSC | hg19 | hg38 | mm9 | mm10 | mm39 | Building from UCSC mammal hubs (Zoonomia)

Last update: 2026-05-12
Started: 2023-09-05

Database Formats and Multi-Contig Support
Overview | Key Features | Database Formats | Indexed Format (Recommended) | Per-Chromosome Format (Legacy) | Creating Databases | New Databases (Indexed Format) | Force Legacy Format | Checking Database Format | Converting Databases | Convert Entire Database | Convert Individual Tracks | Convert Intervals | Migration Guide | When to Migrate | Migration Workflow | Copying Tracks Between Databases | Method 1: Export and Import | Method 2: Batch Copy | Performance Comparison | Operations Faster with Indexed Format | Operations Similar Performance | Summary | Backward Compatibility | Fully Compatible | Example: Mixed Environment | Troubleshooting | "File descriptor limit reached" | "Track not found after copying files" | "Conversion fails with disk space error" | Best Practices | For New Projects | For Existing Projects

Last update: 2026-03-03
Started: 2025-11-26

Manual
Package 'misha' - User Manual | Genomic Database | Dataset API | Working Database vs Datasets | Track Resolution and Collision Handling | Cross-Database Track Expressions | Querying Track and Interval Sources | Creating and Sharing Datasets | Creating a Linked Database | Unloading Datasets | Moving and Copying Tracks | Virtual Tracks Across Sources | Backward Compatibility | File Formats | chrom_sizes.txt | Seq Files | Indexed Format (Recommended) | Per-Chromosome Format | Track Files | Indexed Format (Recommended for Multi-Contig Genomes) | PSSM Set | PSSM Key | PSSM Data | Intervals | 1D Intervals | 2D Intervals | Interval Sets | Dual Intervals | Serializing Intervals, Big and Small Interval Sets | Interval Set Storage Formats | Tracks | 1D Track | Array Track | 2D Track | Track as an Intervals Set | Track Attributes | Track Variables | Track Attributes vs. Track Variables | Track Expressions | Introduction | Virtual Tracks | Value-Based Tracks | Administrating Virtual Tracks | Track Expression Evaluation under Optimization | Revealing Current Iterator Interval | Iterators | Scope | Band | Random Algorithms | Multitasking | Controlling the Number of Processes | Auto-Configuration | Limiting the Memory Consumption | Other Considerations

Last update: 2026-03-03
Started: 2023-09-05

Misha Basics (Short Guide)
The Core Idea | Four Concepts You Need First | 1) Track | 2) Intervals | 3) Iterator | 4) Virtual Track | Minimal Workflow | PWM in One Minute

Last update: 2026-02-16
Started: 2026-02-16
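
The four concepts named in the outline above (track, intervals, iterator, virtual track) can be illustrated with a minimal sketch. It assumes the bundled example database loaded by gdb.init_examples() contains a track named "dense_track"; the virtual track name "dense_max" and the 100 bp bin size are arbitrary choices made for illustration, not taken from the guide itself.

```r
library(misha)

# Load the small example database that ships with the package.
gdb.init_examples()

# 1) Track: list the tracks available in the current database.
gtrack.ls()

# 2) Intervals: a single 1D interval on chromosome 1, coordinates 0-1000.
region <- gintervals(1, 0, 1000)

# 3) Iterator: evaluate the track expression in fixed 100 bp bins.
#    "dense_track" is assumed to exist in the example database.
vals <- gextract("dense_track", intervals = region, iterator = 100)

# 4) Virtual track: the per-bin maximum of the same source track.
#    "dense_max" is a hypothetical name chosen for this sketch.
gvtrack.create("dense_max", "dense_track", "max")
vmax <- gextract("dense_max", intervals = region, iterator = 100)

head(vals)
head(vmax)
```

Both calls return a data frame of intervals with one value column per track expression, so the results can be compared bin by bin.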

Readme and manuals

Help Manual

Help page | Topics
Toolkit for analysis of genomic data | misha-package, misha
Calculates quantiles of a track expression for bins | gbins.quantiles
Calculates summary statistics of a track expression for bins | gbins.summary
Calculates distribution of contact distances | gcis_decay
Runs R commands on a cluster | gcluster.run
Computes auto-correlation between the strands for a file of mapped sequences | gcompute_strands_autocorr
Calculates correlation between track expressions | gcor
Create an example dataset on the fly | gdataset.example_path
Get dataset information | gdataset.info
Load a dataset into the namespace | gdataset.load
List working database and loaded datasets | gdataset.ls
Save a dataset | gdataset.save
Unload a dataset from the namespace | gdataset.unload
Build a misha genome database from a name | gdb.build_genome
Change Database to Indexed Genome Format | gdb.convert_to_indexed
Creates a new Genomic Database | gdb.create
Create and Load a Genome Database | gdb.create_genome
Create a linked database with symlinks to a parent database | gdb.create_linked
Export a database genome as FASTA | gdb.export_fasta
Inspect a resolved genome recipe without building | gdb.genome_info
Returns a list of read-only track attributes | gdb.get_readonly_attrs
Get Database Information | gdb.info
Initializes connection with Genomic Database | gdb.init, gsetroot
Initialise the example Genomic Database | gdb.init.examples, gdb.init_examples
Pre-install UCSC's gff3ToGenePred binary | gdb.install_gff3_converter
Pre-install UCSC's gtfToGenePred binary | gdb.install_gtf_converter
Install interval sets onto an existing groot | gdb.install_intervals
List resolvable genome names | gdb.list_genomes
Mark cached track list as dirty | gdb.mark_cache_dirty
Reloads database from the disk | gdb.reload
Sets read-only track attributes | gdb.set_readonly_attrs
Unloads the genome database | gdb.unload
Changes current working directory in Genomic Database | gdir.cd
Creates a new directory in Genomic Database | gdir.create
Returns the current working directory in Genomic Database | gdir.cwd
Deletes a directory from Genomic Database | gdir.rm
Calculates distribution of track expressions | gdist
Returns evaluated track expression | gextract
Implant donor sequences into a reference genome | ggenome.implant
Transplant sequences from one genome into another | ggenome.transplant
Creates a set of 1D intervals | gintervals
Creates a set of 2D intervals | gintervals.2d
Returns 2D intervals that cover the whole genome | gintervals.2d.all
Intersects two-dimensional intervals with a band | gintervals.2d.band_intersect
Convert 2D interval set to indexed format | gintervals.2d.convert_to_indexed
Intersects two sets of 2D intervals | gintervals.2d.intersect
Unites two sets of 2D intervals | gintervals.2d.union
Returns 1D intervals that cover the whole genome | gintervals.all
Annotates 1D intervals using nearest neighbors | gintervals.annotate
Transforms existing intervals to a chain format | gintervals.as_chain
Returns interval set attributes values | gintervals.attr.export
Returns value of an interval set attribute | gintervals.attr.get
Imports interval set attributes values | gintervals.attr.import
Assigns value to an interval set attribute | gintervals.attr.set
Converts intervals to canonic form | gintervals.canonic
Returns number of intervals per chromosome | gintervals.chrom_sizes
Convert 1D interval set to indexed format | gintervals.convert_to_indexed
Calculate fraction of genomic space covered by intervals | gintervals.coverage_fraction
Calculate total base pairs covered by intervals | gintervals.covered_bp
Returns the database/dataset path for interval sets | gintervals.dataset
Returns all database paths containing an interval set | gintervals.dbs
Calculates difference of two intervals sets | gintervals.diff
Tests for a named intervals set existence | gintervals.exists
Limits intervals to chromosomal range | gintervals.force_range
Convert an interval-indexed matrix back to an intervals + values data.frame | gintervals.from_mat
Creates 1D intervals from coordinate strings | gintervals.from_strings
Import intervals from a BED file | gintervals.import_bed
Imports genes and annotations from files | gintervals.import_genes
Import intervals from a GFF/GTF file | gintervals.import_gff
Import intervals from a VCF file | gintervals.import_vcf
Calculates an intersection of two sets of intervals | gintervals.intersect
Tests for big intervals set | gintervals.is.bigset
Converts intervals from another assembly | gintervals.liftover
Loads a named intervals set | gintervals.load
Loads assembly conversion table from a chain file | gintervals.load_chain
Returns a list of named intervals sets | gintervals.ls
Applies a function to values of track expressions | gintervals.mapply
Mark overlapping intervals with a group ID | gintervals.mark_overlaps
Finds neighbors between two sets of intervals | gintervals.neighbors
Directional neighbor finding functions | gintervals.neighbors.directional, gintervals.neighbors.downstream, gintervals.neighbors.upstream
Normalize intervals to fixed or variable sizes | gintervals.normalize
Returns the path on disk of an interval set | gintervals.path
Calculates quantiles of a track expression for intervals | gintervals.quantiles
Generate random genome intervals | gintervals.random
Combines several sets of intervals | gintervals.rbind
Deletes a named intervals set | gintervals.rm
Creates a named intervals set | gintervals.save
Calculates summary statistics of track expression for intervals | gintervals.summary
Convert intervals + values data.frame to an interval-indexed matrix | gintervals.to_mat
Calculates a union of two sets of intervals | gintervals.union
Updates a named intervals set | gintervals.update
Creates a cartesian-grid iterator | giterator.cartesian_grid
Returns iterator intervals | giterator.intervals
Returns values from a lookup table based on track expression | glookup
Partitions the values of track expression | gpartition
Calculates quantiles of a track expression | gquantiles
Get reverse complement of DNA sequence | grevcomp
Returns samples from the values of track expression | gsample
Finds intervals that match track expression | gscreen
Divides track expression into segments | gsegment
Complement DNA sequence | gseq.comp
Returns DNA sequences | gseq.extract
Score DNA sequences with a k-mer over a region of interest | gseq.kmer
Compute k-mer distribution in genomic intervals | gseq.kmer.dist
Score DNA sequences with a PWM over a region of interest | gseq.pwm
Show optimal edits to reach a PWM score threshold | gseq.pwm_edits
Read motifs from a HOMER motif format file | gseq.read_homer
Read motifs from a JASPAR PFM format file | gseq.read_jaspar
Read motifs from a MEME minimal motif format file | gseq.read_meme
Reverse DNA sequence | gseq.rev
Get reverse complement of DNA sequence | gseq.revcomp
Calculates summary statistics of track expression | gsummary
Create a bin mapping from value-based merge specifications | gsynth.bin_map
Resolve a cell-level merge specification into flat bin indices | gsynth.cell_merge
Convert a legacy RDS gsynth model to .gsm format | gsynth.convert
Forbid a k-mer pattern in a trained gsynth model | gsynth.forbid_kmer
Load a gsynth.model from disk | gsynth.load
Generate random genome sequences | gsynth.random
Iteratively replace a k-mer in the genome | gsynth.replace_kmer
Sample a synthetic genome from a trained Markov model | gsynth.sample
Save a gsynth.model to disk in .gsm format | gsynth.save
Score the genome under a trained gsynth model | gsynth.score
Train a stratified Markov model from genome sequences | gsynth.train
Convert 2D track to indexed format | gtrack.2d.convert_to_indexed
Creates a 'Rectangles' track from intervals and values | gtrack.2d.create
Creates a 2D track from tab-delimited file | gtrack.2d.import
Creates a track from a file of inter-genomic contacts | gtrack.2d.import_contacts
Returns values from 'Array' track | gtrack.array.extract
Returns column names of array track | gtrack.array.get_colnames
Creates an array track from array tracks or files | gtrack.array.import
Sets column names of array track | gtrack.array.set_colnames
Returns track attributes values | gtrack.attr.export
Returns value of a track attribute | gtrack.attr.get
Imports track attributes values | gtrack.attr.import
Assigns value to a track attribute | gtrack.attr.set
Converts a track to the most current format | gtrack.convert
Convert a track to indexed format | gtrack.convert_to_indexed
Copies one or more tracks | gtrack.copy
Creates a track from a track expression | gtrack.create
Creates a 'Dense' track from intervals and values | gtrack.create_dense
Create directories needed for track creation | gtrack.create_dirs
Creates a new track from PSSM energy function | gtrack.create_pwm_energy
Creates a 'Sparse' track from intervals and values | gtrack.create_sparse
Returns the database/dataset path for a track | gtrack.dataset
Returns the database paths that contain track(s) | gtrack.dbs
Tests for a track existence | gtrack.exists
Export a track to bedGraph format | gtrack.export_bedgraph
Export a track to BigWig format | gtrack.export_bigwig
Creates a track from WIG / BigWig / BedGraph / BED / tab-delimited file | gtrack.import
Creates a track from a file of mapped sequences | gtrack.import_mappedseq
Creates one or more tracks from multiple WIG / BigWig / BedGraph / tab-delimited files on disk or FTP | gtrack.import_set
Returns information about a track | gtrack.info
Imports a track from another assembly | gtrack.liftover
Creates a new track from a lookup table based on track expression | gtrack.lookup
Returns a list of track names | gtrack.ls
Modifies track contents | gtrack.modify
Renames or moves a track | gtrack.mv
Returns the path on disk of a track | gtrack.path
Deletes a track | gtrack.rm
Creates a new track from smoothed values of track expression | gtrack.smooth
Returns value of a track variable | gtrack.var.get
Returns a list of track variables for a track | gtrack.var.ls
Deletes a track variable | gtrack.var.rm
Assigns value to a track variable | gtrack.var.set
Defines rules for a single value calculation of a virtual 'Array' track | gvtrack.array.slice
Deletes all virtual tracks | gvtrack.clear
Creates a new virtual track | gvtrack.create
Attach or clear a genomic mask filter on a virtual track | gvtrack.filter
Returns the definition of a virtual track | gvtrack.info
Defines modification rules for a one-dimensional iterator in a virtual track | gvtrack.iterator
Defines modification rules for a two-dimensional iterator in a virtual track | gvtrack.iterator.2d
Returns a list of virtual track names | gvtrack.ls
Deletes a virtual track | gvtrack.rm
Downloads files from FTP server | gwget
Calculates Wilcoxon test on sliding windows over track expression | gwilcox
Print summary of a gsynth.model | print.gsynth.model