!sfacts model_info default

Defined in: sfacts.model_zoo.model5
Summary: Metagenotype model intended for inference, designed for simplicity and speed.

Just like model4 but with uniform sequencing error.

    
Default Hyperparameters: {'gamma_hyper': 1e-10, 'rho_hyper': 0.2, 'rho_hyper2': 1.0, 'pi_hyper': 0.5, 'pi_hyper2': 0.9, 'mu_hyper_mean': 1.0, 'mu_hyper_scale': 1.0, 'm_hyper_concentration': 1.0, 'epsilon_hyper_mode': 0.01, 'epsilon_hyper_spread': 1.5}


!sfacts fit \
    --verbose \
    --num-strains 15 \
    --random-seed 0 \
    sim.filt.ss-0.mgen.nc sim.filt.ss-0.fit.world.nc

2023-05-19 15:25:49,962 START: Fitting 15 strains with data shape Frozen({'sample': 50, 'position': 500, 'allele': 2}).
  0%|| 0/1000000 [00:00<?, ?it/s]/Users/byronsmith/Projects/StrainFacts/sfacts/model_zoo/components.py:310: UserWarning: Using LogTriangle as an approximation for random sampling. This is probably a bad idea.
  warnings.warn(
  1%|| 7399/1000000 [00:42<1:35:32, 173.15it/s, NLP=+1.11e+05, delta=+0.00e+00, lag100=+0.00e+00, lr=+7.63e-07]
2023-05-19 15:26:32,715 Converged: NLP=1.11292e+05
/Users/byronsmith/Projects/StrainFacts/sfacts/model_zoo/components.py:310: UserWarning: Using LogTriangle as an approximation for random sampling. This is probably a bad idea.
  warnings.warn(
2023-05-19 15:26:32,734 END: Fitting 15 strains with data shape Frozen({'sample': 50, 'position': 500, 'allele': 2}). (took 43 seconds)
2023-05-19 15:26:32,742 Average metagenotype error: 0.01823714681497938


import sfacts as sf

fit = sf.data.World.load('sim.filt.ss-0.fit.world.nc')

# Plot inferred relative abundances for each sample (the "community").
sf.plot.plot_community(
    fit,
    col_linkage_func=lambda w: w.metagenotype.linkage('sample'),
    row_linkage_func=lambda w: w.genotype.linkage('strain'),
)

# Plot the inferred genotype of the 10 simulated strains.
sf.plot.plot_genotype(
    fit,
    row_linkage_func=lambda w: w.genotype.linkage("strain"),
    col_linkage_func=lambda w: w.metagenotype.linkage("position"),
)

<seaborn.matrix.ClusterGrid at 0x149b81a10>


!sfacts dump sim.filt.ss-0.fit.world.nc \
    --genotype sim.filt.ss-0.fit.geno.tsv \
    --community sim.filt.ss-0.fit.comm.tsv


!head sim.filt.ss-0.fit.geno.tsv sim.filt.ss-0.fit.comm.tsv

==> sim.filt.ss-0.fit.geno.tsv <==
strain	position	genotype
0	549	0.9594818
0	715	0.98840594
0	603	0.9872556
0	545	0.02570168
0	424	0.9771314
0	646	0.988557
0	438	0.011858768
0	891	0.02623163
0	963	0.9879804

==> sim.filt.ss-0.fit.comm.tsv <==
sample	strain	community
0	0	0.00020524139
0	1	0.0002019747
0	2	0.00012614406
0	3	0.99850655
0	4	2.3415639e-07
0	5	0.0002068418
0	6	0.00016549
0	7	0.00019602856
0	8	0.0001604072

Simple Model Fitting¶