CellDancer benchmark on cell cycle data

Infer cell-specific transcription rates with cellDancer and compare them with the ground truth.

Note: cellDancer requires anndata==0.8.0.
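A version mismatch is easier to catch before the heavy imports run. The following is a minimal sketch of such a check using only the standard library; the helper names `anndata_version_ok` and `installed_anndata_version` are hypothetical and not part of cellDancer or anndata.

```python
from importlib import metadata

REQUIRED_ANNDATA = "0.8.0"  # version stated in the note above


def anndata_version_ok(installed, required=REQUIRED_ANNDATA):
    """Return True only if the installed version string equals the required one."""
    return installed is not None and installed == required


def installed_anndata_version():
    """Return the installed anndata version string, or None if anndata is missing."""
    try:
        return metadata.version("anndata")
    except metadata.PackageNotFoundError:
        return None


version = installed_anndata_version()
if not anndata_version_ok(version):
    print(f"Warning: found anndata {version}, but cellDancer expects {REQUIRED_ANNDATA}")
```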

Library imports

import celldancer as cd

import scanpy as sc

from rgv_tools import DATA_DIR

Constants

DATASET = "cell_cycle_rpe1"
SAVE_DATA = True
if SAVE_DATA:
    (DATA_DIR / DATASET / "processed").mkdir(parents=True, exist_ok=True)
    (DATA_DIR / DATASET / "results").mkdir(parents=True, exist_ok=True)

Data loading

adata = sc.read_h5ad(DATA_DIR / DATASET / "processed" / "adata_processed.h5ad")
cell_type_u_s = cd.adata_to_df_with_embed(
    adata,
    cell_type_para="cell_cycle_phase",
    save_path=DATA_DIR / DATASET / "processed" / "cell_type_u_s_sample_df_processed.csv",
)
100%|██████████| 141/141 [00:02<00:00, 53.48it/s]
loss_df, cellDancer_df = cd.velocity(
    cell_type_u_s, permutation_ratio=0.1, norm_u_s=False, norm_cell_distribution=False, n_jobs=8
)
Using /ictstr01/home/icb/weixu.wang/regulatory_velo/regvelo_revision/cell_cycle_REF/cellDancer_velocity_2025-08-19 17-14-48 as the output path.
Arranging genes for parallel job.
141  genes were arranged to  18  portions.
Velocity Estimation: 100%|██████████| 18/18 [01:13<00:00,  4.97s/it]
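The log reports 141 genes arranged into 18 portions for `n_jobs=8`. These numbers are consistent with the genes being processed in consecutive chunks of at most `n_jobs` genes each, since ceil(141 / 8) = 18. The sketch below reproduces that arithmetic under this assumption; `chunk_ranges` is a hypothetical helper for illustration, not a cellDancer function.

```python
def chunk_ranges(n_items, chunk_size):
    """Split range(n_items) into consecutive (start, end) half-open intervals."""
    return [
        (start, min(start + chunk_size, n_items))
        for start in range(0, n_items, chunk_size)
    ]


ranges = chunk_ranges(141, 8)
print(len(ranges))  # 18 portions
print(ranges[-1])   # (136, 141): the last portion holds the remaining 5 genes
```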
                                                                    
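# Reshape the long per-(cell, gene) table into a cells x genes matrix of alpha (transcription rate) estimates.
alpha_matrix = cellDancer_df.pivot(index="cellIndex", columns="gene_name", values="alpha")
# Relabel rows with the cell names; this assumes cellIndex follows the row order of adata.
alpha_matrix.index = adata.obs_names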
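The reshaping step can be illustrated on a toy table. This sketch assumes, as the assignment of `adata.obs_names` implies, that `cellIndex` is the 0-based position of each cell in `adata`; the gene names and cell labels are made up. The explicit check guards against relabelling rows by position when cells are missing or reordered.

```python
import pandas as pd

# Toy stand-in for cellDancer_df: one row per (cell, gene) pair.
long_df = pd.DataFrame(
    {
        "cellIndex": [0, 0, 1, 1, 2, 2],
        "gene_name": ["GeneB", "GeneA", "GeneB", "GeneA", "GeneB", "GeneA"],
        "alpha": [0.5, 1.0, 0.7, 1.2, 0.9, 1.4],
    }
)
obs_names = pd.Index(["cell_a", "cell_b", "cell_c"])  # hypothetical cell labels

# Long -> wide: rows are cells, columns are genes, values are alpha.
wide = long_df.pivot(index="cellIndex", columns="gene_name", values="alpha")

# Relabelling by position is only safe if row i of the pivot is cell i.
assert list(wide.index) == list(range(len(obs_names)))
wide.index = obs_names

print(wide.shape)          # (3, 2)
print(list(wide.columns))  # ['GeneA', 'GeneB']
```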

Save data

if SAVE_DATA:
    # cell_type_u_s was already written to "cell_type_u_s_sample_df_processed.csv" via save_path above.
    alpha_matrix.to_csv(DATA_DIR / DATASET / "processed" / "celldancer_alpha_estimate_processed.csv")
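The saved matrix can later be compared with ground-truth transcription rates. The source does not specify the metric, so the following is only one possible sketch: a per-gene Spearman correlation, computed as the Pearson correlation of ranks. The function `per_gene_spearman` and the toy data are hypothetical.

```python
import pandas as pd


def per_gene_spearman(estimate, truth):
    """Per-gene Spearman correlation over the cells and genes shared by both tables."""
    genes = estimate.columns.intersection(truth.columns)
    cells = estimate.index.intersection(truth.index)
    est = estimate.loc[cells, genes]
    ref = truth.loc[cells, genes]
    # The Pearson correlation of ranks equals the Spearman correlation.
    return est.rank().corrwith(ref.rank())


cells = ["cell_a", "cell_b", "cell_c"]
estimate = pd.DataFrame({"GeneA": [1.0, 2.0, 4.0], "GeneB": [3.0, 2.0, 1.0]}, index=cells)
truth = pd.DataFrame({"GeneA": [10.0, 20.0, 30.0], "GeneB": [1.0, 2.0, 3.0]}, index=cells)

scores = per_gene_spearman(estimate, truth)
print(scores)  # GeneA is perfectly concordant, GeneB perfectly reversed
```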