User Manual · Version 1.0

Meet Quyi.
Your Autonomous Research Intelligence.

A multimodal AI research platform that reads your data, plans the analysis, writes and runs the code, corrects its own errors, reviews its conclusions, and delivers a peer-quality report — all in a single conversation. From bench science to social research to business intelligence.

Specialized Agents: 8
Example Use Cases: 158
Code Languages: Python + R
Hallucinated Citations: Zero
Real-Time Streaming: Live

Not a chatbot.
A research collaborator.

Quyi is an agentic research and analysis platform designed for researchers, analysts, data scientists, and domain experts across every discipline — natural sciences, social sciences, business intelligence, policy, engineering, and humanities — who need rigorous, reproducible, and thoroughly documented analysis without spending weeks writing code or chasing citations.

🔬

For Research — Any Discipline

Upload your data, describe your research question, and Quyi produces statistically rigorous analysis with verified citations from live academic databases — never fabricated. Natural sciences, social sciences, humanities, health research, economics — Quyi produces documented findings ready for peer review, policy briefs, or stakeholder reports.

📊

For Industry

Feed Quyi your business data — sales, operations, customer, financial — and receive executable analysis code, actionable insights, and boardroom-ready reports generated and validated automatically. From raw data to decision support in a single session.

🖼️

Multimodal

Quyi understands images as fluently as it understands numbers. Upload microscopy images, scientific figures, charts, medical scans, satellite imagery, or product photos and Quyi will analyze them alongside your tabular data as a unified body of evidence.

🧠

Persistent Memory

Every project accumulates knowledge. Quyi remembers your previous analyses, findings, methods, and context. Resume a project months later and it recalls everything — previous results inform new analyses, and your institutional knowledge is never lost.

The Core Difference

Traditional AI assistants suggest. Quyi acts. It doesn't just write code — it runs it, checks the output, fixes errors, validates the methodology, and writes the final report. The difference between a suggestion and a completed, reproducible analysis.

"Quyi collapsed a 3-week analysis pipeline into a focused 2-hour session. The citations were real, the statistics were sound, and the code ran first time."
— Early user, pharmaceutical research team

Built on the right tools for the job.

Every component in Quyi was chosen for a specific reason. Not trends — capability. Here is what powers the platform and why it matters for your work.

Agents

Coordinated AI Agents — Division of Intelligence

Quyi runs a coordinated team of specialized AI agents, each equipped with its own set of tools — code execution, data reading, statistical libraries, file writing, literature search, and report generation. Rather than a single AI trying to do everything at once, each agent focuses on one responsibility: understanding your goal, planning the approach, writing and running the analysis, catching and fixing errors, critically reviewing the methodology, and documenting the findings. This division of responsibility produces dramatically better results than any single model could achieve alone.

Why this: Specialized agents outperform generalists. An agent that only reviews methodology finds more issues than one that both runs the code and checks it.
RAGBrain

Semantic Document Memory — Find What Matters

When you upload PDFs, papers, or documents, Quyi builds a high-dimensional vector index of their content using state-of-the-art embedding models. This allows agents to retrieve the most relevant passages from your document library at query time — not keyword matching, but deep semantic understanding. A parallel knowledge graph captures relationships between concepts across documents, enabling multi-hop reasoning: connecting a method described in one paper with findings reported in another.

Why this: Sub-millisecond semantic retrieval at any document scale. The graph layer enables reasoning that flat search cannot — "what papers connect this method to that finding?"
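The retrieval idea described above can be illustrated with a minimal sketch. It assumes passages have already been converted to embedding vectors; the three-dimensional vectors and passage names are made up for illustration, and the sketch does not show Quyi's actual index or embedding model.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec, index, k=2):
    """Return the k passage ids whose vectors are most similar to the query."""
    ranked = sorted(index, key=lambda pid: cosine(query_vec, index[pid]), reverse=True)
    return ranked[:k]

# Hypothetical toy embeddings (illustrative values only).
index = {
    "passage_metformin": [0.9, 0.1, 0.0],
    "passage_ampk": [0.7, 0.3, 0.1],
    "passage_climate": [0.0, 0.1, 0.9],
}
query = [1.0, 0.2, 0.0]
results = top_k(query, index, k=2)
```

A brute-force scan like this is fine for a handful of passages; a production index would typically use an approximate nearest-neighbour library instead.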
KnowledgeGraph

3D Knowledge Graph — How Concepts Connect

Every document you upload is parsed into knowledge triples: structured facts of the form Entity → relation → Entity. These triples are stored in a directed graph and rendered interactively in 3D. When agents answer a question, they walk this graph to surface explicit causal chains that pure text search would miss — following edges from a known concept to related upstream or downstream concepts your documents describe.

Knowledge Graph — Metformin Pharmacology  ·  8 nodes  ·  10 edges  ·  2 source papers
[Figure: interactive 3D knowledge graph connecting metformin, complex I, ATP synthesis, ROS production, AMPK, cell metabolism, oxidative stress, and glucose uptake, with edges extracted from paper_1.pdf and paper_2.pdf. Legend: hub node (high connectivity), concept / protein node, primary relation, indirect relation. Quyi Knowledge Graph Engine · NetworkX + FAISS]
Extracted triples (stored in graph)
metformin → inhibits → complex I
metformin → activates → AMPK
complex I → reduces → ATP synthesis
complex I → increases → ROS production
ATP synthesis → regulates → cell metabolism
ROS production → causes → oxidative stress
AMPK → promotes → glucose uptake
oxidative stress → impairs → cell metabolism
How the graph helps at query time
1. Query arrives: "How does metformin affect cellular energy?"
2. Entity extraction: key concepts identified → metformin, cellular energy
3. Graph walk: all edges connected to those nodes are collected; the path metformin → complex I → ATP synthesis → cell metabolism is surfaced automatically, even if no single passage describes the full chain
4. Context injection: graph facts + relevant text passages are merged and handed to the analysis agent, which now has both structural logic and textual evidence to reason from
Why this matters: pure text search would find passages mentioning metformin and passages mentioning ATP separately — but would not connect them through complex I unless that full chain appeared verbatim in the text. The graph makes multi-hop reasoning explicit.
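The graph walk can be sketched in a few lines using the eight triples listed above. This is a simplified illustration with a plain dictionary and breadth-first search, not the NetworkX-based engine itself; the function names `build_graph` and `find_path` are hypothetical.

```python
from collections import defaultdict, deque

TRIPLES = [
    ("metformin", "inhibits", "complex I"),
    ("metformin", "activates", "AMPK"),
    ("complex I", "reduces", "ATP synthesis"),
    ("complex I", "increases", "ROS production"),
    ("ATP synthesis", "regulates", "cell metabolism"),
    ("ROS production", "causes", "oxidative stress"),
    ("AMPK", "promotes", "glucose uptake"),
    ("oxidative stress", "impairs", "cell metabolism"),
]

def build_graph(triples):
    """Map each head entity to its outgoing (relation, tail) edges."""
    graph = defaultdict(list)
    for head, relation, tail in triples:
        graph[head].append((relation, tail))
    return graph

def find_path(graph, start, goal):
    """Breadth-first search; returns the shortest chain of triples, or None."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for relation, nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [(node, relation, nxt)]))
    return None

graph = build_graph(TRIPLES)
chain = find_path(graph, "metformin", "cell metabolism")
```

Here `chain` holds the three-hop path through complex I and ATP synthesis, even though no single triple links metformin to cell metabolism directly.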
MemPalace

Persistent Project Memory — Knowledge That Grows

Every project stores persistent memory in a dedicated local palace — a vector database for your analytical findings and a structured knowledge store linking concepts, methods, and conclusions. When you reopen a project after days, weeks, or months, Quyi wakes up its memory and immediately knows where you left off, what you found, what you tried, and what remains unexplored. Projects are fully portable: export as a ZIP archive, share with collaborators, or import on another machine.

Why this: Analysis is iterative. A system that remembers what worked is fundamentally more powerful than one that starts from zero every session.
🏛️
MemPalace  ·  Open Source  ·  MIT License

The memory and RAG engine powering Quyi's persistent project intelligence is built on MemPalace — an open-source Python library for building long-term AI memory systems with vector search, knowledge graphs, and verified citations. Licensed under the MIT License, it is free to use, modify, and distribute.

Milla Jovovich — Maintainer
Actress & supermodel, and now AI enthusiast  ·  "Multipass!"
Citations

Verified Citations — Zero Hallucination, by Design

Every citation Quyi produces is looked up in real time against CrossRef (200M+ publications) and Semantic Scholar (220M+ papers). If a paper doesn't exist in those databases, Quyi doesn't cite it. Authors, year, journal, DOI — all verified against the live record. The result is a bibliography you can submit to a journal or present to a client without manually checking every reference. This is not a feature — it's a foundational requirement for scientific credibility.

Why this: Fabricated citations have appeared in published academic work and court filings. In scientific and professional contexts, this is unacceptable. Quyi prevents it architecturally.
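As a rough sketch of what such a check could look like, the code below compares a claimed citation against a record in the shape returned by the public CrossRef REST API (`https://api.crossref.org/works/{doi}`). The helper names, the claimed-citation keys, and the placeholder record are assumptions for illustration; this is not Quyi's own verification code, and the Semantic Scholar lookup is not shown.

```python
import json
import urllib.parse
import urllib.request

def fetch_crossref(doi):
    """Fetch the CrossRef record for a DOI (network call; not executed here)."""
    url = "https://api.crossref.org/works/" + urllib.parse.quote(doi)
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)["message"]

def matches_record(claimed, record):
    """True if title, year, and an author family name agree with the record."""
    title = (record.get("title") or [""])[0]
    year = record.get("issued", {}).get("date-parts", [[None]])[0][0]
    families = {a.get("family", "").lower() for a in record.get("author", [])}
    return (
        claimed["title"].strip().lower() == title.strip().lower()
        and claimed["year"] == year
        and claimed["author_family"].lower() in families
    )

# Placeholder record in CrossRef's JSON shape (not a real publication).
record = {
    "title": ["Example Title"],
    "issued": {"date-parts": [[2020, 5, 1]]},
    "author": [{"family": "Doe", "given": "Jane"}],
}
claimed = {"title": "example title", "year": 2020, "author_family": "Doe"}
ok = matches_record(claimed, record)
```

A citation that fails this comparison, or whose DOI lookup returns no record, would simply be dropped rather than cited.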
Multimodal

Image + Data Analysis — One Unified Analysis

Quyi's agents can reason over images and numerical data simultaneously. Upload a fluorescence microscopy image alongside your cell count CSV, or provide a satellite photograph with your geospatial dataset, and the analysis treats both as evidence. Scientific figures, plots, charts, medical images, electron microscopy — all are readable and interpretable by the same agents that analyze your tabular data, producing a unified finding rather than separate analyses that must be reconciled manually.

Why this: Real research is multimodal. The image and the number often tell different parts of the same story. Separating them forces manual synthesis — Quyi does it automatically.
Streaming

Real-Time Execution — Full Transparency

You watch Quyi work in real time. Every agent action streams to your screen as it happens — the analysis plan as it's written, the code as it's generated, the execution output as it runs, the errors as they're caught and fixed, the final report as it's composed. There is no black box. You can intervene, redirect, or simply observe the reasoning process. When a result appears, you know exactly how it was reached and can trust or challenge it with full context.

Why this: Transparency is not optional in scientific and professional work. Seeing the reasoning process lets you catch methodological issues early and builds justified confidence in the results.

What Quyi can do for you.

A comprehensive overview of Quyi's analytical capabilities across data types, domains, and output formats.

📈
Statistical Analysis
Descriptive statistics, hypothesis testing, ANOVA, regression, correlation, time series, survival analysis, Bayesian inference, non-parametric tests.
🤖
Machine Learning
Classification, regression, clustering, dimensionality reduction, cross-validation, feature importance, hyperparameter tuning, model comparison and explainability.
🖼️
Image Analysis
Scientific figure interpretation, microscopy analysis, medical imaging description, chart digitization, plot pattern recognition, visual comparison across conditions.
📚
Literature Review
Synthesize uploaded papers, extract and compare methodologies, identify research gaps, generate annotated bibliographies with verified citations.
📊
Data Visualization
Publication-quality plots, interactive dashboards, heatmaps, network graphs, survival curves, geographic maps — all saved automatically to your project.
🧬
Bioinformatics
Gene expression analysis, pathway enrichment, sequence alignment, protein structure reasoning, GWAS, phylogenetics, single-cell analysis.
💰
Financial Analysis
Time series forecasting, risk modeling, portfolio analysis, anomaly detection, earnings analysis, regulatory reporting, credit scoring.
🗺️
Geospatial Analysis
Spatial statistics, geographic clustering, satellite image interpretation, environmental monitoring, route optimization, spatial regression.
💬
Text & NLP
Sentiment analysis, topic modeling, named entity recognition, corpus analysis, survey coding, semantic search across document collections.
📄
Report Generation
Full structured reports with methods, results, discussion, limitations, and verified citations — ready for journals, regulators, or executive stakeholders.
🔁
Iterative Refinement
Ask follow-up questions in plain language after results appear. Request deeper dives, parameter changes, or alternative approaches — all in context of prior work.
📦
Portable Projects
Export any project as a ZIP — memory, reports, code, citations included — and re-import elsewhere or share with collaborators. Storage you control, data you own.

From data to insight in five steps.

Quyi is designed to be powerful without being complex. Here is the complete workflow from first login to finished findings — whether that's a journal paper, a policy brief, a business report, or a research dataset.

1

Create or Open a Project

Click the project selector in the sidebar and give your project a name — gene_expression_study, q4_revenue_analysis, climate_dataset_2026. A persistent workspace is created with its own memory, citation library, and full history of analyses. To continue previous work, open the project and your last session is fully restored — goal, plan, code, results, and chat history all exactly as you left them.

2

Upload Your Data, Documents, and Images

Use the three upload panels in the setup screen. Memory Bank is for background documents — papers, reports, methodology guides — that inform the analysis without being analyzed directly. Images for Analysis accepts any visual data: microscopy images, charts, scientific figures, medical scans. Workspace Data is for the tabular datasets Quyi will actually run code against — CSV, Excel, TSV. All uploads are processed immediately and become available to every agent in the pipeline.

3

State Your Research Goal

Write your goal in plain language — but be specific. The more precisely you define what you want, the more targeted the analysis. Instead of "analyze my data", write "Compare gene expression between treated and control groups using DESeq2-equivalent statistics, apply FDR correction, and generate a volcano plot annotating the top 20 differentially expressed genes." Quyi reads your goal, your uploaded data, and your documents together before planning anything.

4

Review the Plan and Execute

Before running any code, Quyi presents an analysis blueprint — the proposed methodology, statistical approach, expected outputs, and required libraries. Review it. If the approach isn't what you intended, describe the changes before proceeding. Once you execute, the live console streams every action in real time: libraries being installed, code being written, the script running, errors being caught and fixed autonomously, and the final report being composed. Nothing is hidden.

5

Review Results and Continue the Conversation

Your results appear as a structured report with the analysis code, visualizations, and verified citations. This is the beginning of the conversation, not the end. Use the Chat button to ask follow-up questions, request deeper breakdowns of specific findings, explore unexpected results, or ask Quyi to explain its methodology in plain language. Use Refine to extend the analysis with new parameters or additional questions. Every finding is saved to project memory and available in future sessions.

💡 Pro Tip — Use the Citations Panel Before You Run

Before executing analysis, use the Verified Citations panel to search for relevant papers on your topic. Save the ones that matter — Quyi will incorporate them as methodological anchors when composing the final report, producing literature-grounded conclusions rather than generic statistical summaries.

158 ways researchers, analysts, and teams use Quyi.

Real-world scenarios showing how Quyi accelerates work across academia, industry, social sciences, public policy, and more. Each example includes the context, what Quyi produces, and a prompt you can use directly.

🎓 Academia  ·  Research & Science
01
Differential Gene Expression Analysis
A molecular biology lab has RNA-seq count data from a drug treatment experiment — 6 treated and 6 control samples across 20,000 genes. They need to identify which genes respond to the treatment and visualize the magnitude and significance of those changes.
Quyi applies negative binomial modeling to the count data, estimates log₂ fold-changes, performs Wald tests, and corrects for multiple comparisons using the Benjamini-Hochberg procedure. It generates a volcano plot annotating the top 20 significant genes, an MA plot, and a heatmap of the top 50 differentially expressed genes across all samples — all saved to the project with a methods section citing foundational DEG literature.
Perform differential expression analysis on my RNA-seq counts CSV (6 treated, 6 control). Apply BH-FDR correction at 0.05, generate a volcano plot annotating top-20 genes, an MA plot, and a clustered heatmap of the top 50 DEGs. Report log2FC and adjusted p-values in a results table.
Genomics  ·  Statistics  ·  Visualization
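The Benjamini-Hochberg correction named in this example can be sketched as follows. The p-values are made up for illustration, and the function name is hypothetical; a real run would apply the procedure to the per-gene Wald test p-values.

```python
def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the original input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Illustrative p-values for four hypothetical genes.
pvals = [0.01, 0.04, 0.03, 0.20]
adj = benjamini_hochberg(pvals)
significant = [p <= 0.05 for p in adj]
```

With these inputs only the first gene stays below the 0.05 FDR threshold, which shows why raw p-values of 0.03 and 0.04 are not enough on their own once multiple testing is accounted for.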
02
Long-Term Climate Trend Detection
A climate scientist has 60 years of monthly temperature anomaly data for a region and needs to formally test whether a warming trend exists, decompose its seasonal component, and compare the observed trend to IPCC benchmarks from uploaded reference documents.
Quyi applies Mann-Kendall non-parametric trend testing, estimates the Sen's slope for trend magnitude, decomposes the time series using STL to separate seasonal, trend, and residual components, and fits a linear regression with confidence intervals. It compares the estimated decadal trend to the IPCC benchmark values retrieved from the uploaded papers and flags whether the regional trend exceeds the global average.
Analyze my 60-year monthly temperature anomaly dataset. Run a Mann-Kendall trend test and Sen's slope estimate, perform STL decomposition, plot the trend with 95% CI, and compare the decadal warming rate to the IPCC AR6 values in my uploaded reference document.
Climate Science  ·  Time Series
03
Systematic Literature Review Synthesis
A PhD student has collected 35 papers on CRISPR therapeutic applications and needs to synthesize them into a structured review — comparing methodologies, identifying where findings converge and diverge, and producing a summary table ready for the review paper.
Quyi reads all uploaded papers through the RAG system, groups them by therapeutic target and delivery mechanism, extracts reported efficacy metrics and adverse events from each study, constructs a comparison matrix, identifies three areas of methodological disagreement in the literature, and generates a formatted summary table with DOI-verified citations for every claim. The output is structured to slot directly into a manuscript methods and results section.
I've uploaded 35 papers on CRISPR gene editing therapies. Synthesize their key findings by therapeutic target, compare reported efficacy metrics and adverse event profiles, identify areas of methodological disagreement, and generate a structured summary table with citations. Format the output for direct inclusion in a systematic review manuscript.
Literature Review  ·  Biotechnology
04
Clinical Trial Survival Analysis Report
A clinical researcher has Phase II trial data with 180 patients in two arms and needs to report primary and secondary endpoints for a regulatory submission, including survival curves, hazard ratios, and a fully formatted clinical results section.
Quyi fits Kaplan-Meier curves for both arms, runs the log-rank test for significance, fits a Cox proportional hazards model with covariates, calculates hazard ratios and 95% confidence intervals for primary and secondary endpoints, tests the proportional hazards assumption, and generates a results section written in ICH E9 statistical reporting style with all values properly formatted for regulatory submission.
Analyze my Phase II trial dataset. Produce Kaplan-Meier curves with risk tables for both arms, run the log-rank test, fit a Cox PH model with age and baseline score as covariates, and report HR with 95% CI for both primary and secondary endpoints. Write a regulatory-style results section in ICH E9 format.
Clinical Research  ·  Survival Analysis  ·  Regulatory
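The Kaplan-Meier estimator at the heart of this example can be written out by hand, as in the sketch below. The follow-up times and event flags are invented for illustration, and the function name is hypothetical; the log-rank test and Cox model are not shown.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times  : follow-up time for each patient
    events : 1 if the event was observed, 0 if the patient was censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        # Group every patient who shares this time point.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve

# Made-up data for a single trial arm (8 patients).
times = [2, 3, 3, 5, 8, 8, 9, 12]
events = [1, 1, 0, 1, 1, 1, 0, 1]
curve = kaplan_meier(times, events)
```

Censored patients (event flag 0) leave the risk set without lowering the survival estimate, which is the property that distinguishes this estimator from a naive proportion.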
05
Fluorescence Microscopy Quantification
A cell biology lab has 18 fluorescence microscopy images — 9 control and 9 treated — and needs to quantify whether the treatment produces statistically significant changes in cell morphology, including nuclear area and membrane irregularity.
Quyi analyzes each uploaded image, describes the observable morphological features, estimates relative nuclear area and shape index from intensity distributions, computes descriptive statistics across both groups, runs a Mann-Whitney U test for each morphological measure, and produces a comparative summary with representative images annotated and a formal statistical report with effect sizes and confidence intervals.
I'm uploading 18 fluorescence microscopy images — 9 control, 9 treated. Analyze nuclear morphology across both groups, estimate nuclear area and shape irregularity, compare groups with Mann-Whitney U tests, report effect sizes, and generate annotated representative images for each condition.
Cell Biology  ·  Image Analysis
06
Psychological Meta-Analysis with Forest Plot
A psychologist is pooling effect sizes from 18 independent RCTs on mindfulness-based therapy for generalized anxiety disorder and needs a rigorous meta-analysis with heterogeneity assessment and publication bias testing for journal submission.
Quyi fits a random-effects meta-analytic model using restricted maximum likelihood, calculates the pooled effect size with 95% CI and 95% prediction interval, computes I² and Cochran's Q for heterogeneity, generates a publication-quality forest plot with individual study CIs and the pooled estimate, runs Egger's test for funnel plot asymmetry, and produces a PRISMA-aligned methods section with citations to foundational meta-analytic methodology papers.
Run a random-effects meta-analysis on my 18-study dataset (effect sizes, sample sizes, SEs) for mindfulness therapy on GAD. Report pooled SMD with 95% CI and prediction interval, I² and Cochran's Q, generate a forest plot and funnel plot, run Egger's test, and write a PRISMA-aligned methods section.
Psychology  ·  Meta-Analysis
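To make the pooled estimate, Cochran's Q, and I² concrete, the sketch below uses the closed-form DerSimonian-Laird estimator. Note that this is a simpler stand-in for the restricted maximum likelihood fit described above, and the three effect sizes and variances are made up for illustration.

```python
import math

def random_effects_dl(effects, variances):
    """DerSimonian-Laird random-effects pooling with Q, I² and tau²."""
    w = [1 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)                    # between-study variance
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    w_star = [1 / (v + tau2) for v in variances]     # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    return {
        "pooled": pooled,
        "ci95": (pooled - 1.96 * se, pooled + 1.96 * se),
        "Q": q,
        "I2": i2,
        "tau2": tau2,
    }

# Made-up standardized mean differences and their variances.
result = random_effects_dl([0.2, 0.5, 0.8], [0.01, 0.01, 0.01])
```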
07
Epidemiological SIR Outbreak Modeling
An epidemiologist has 16 weeks of daily reported case data from an outbreak and needs to estimate the basic reproduction number R₀, project peak timing, and quantify uncertainty in those estimates for a public health briefing.
Quyi fits a discrete-time SIR compartmental model to the case counts using nonlinear least squares, estimates R₀ with bootstrap confidence intervals, runs an MCMC uncertainty analysis to propagate parameter uncertainty into projections, generates an epidemic curve with observed data and model fit overlaid, projects the epidemic trajectory with 80% and 95% credible intervals, and writes a plain-language public health summary alongside the technical methods.
Fit an SIR model to my 16-week daily case count data. Estimate R₀ with 95% CI via bootstrap, run MCMC to quantify projection uncertainty, generate epidemic curves with credible intervals, project peak timing and magnitude, and write a plain-language briefing alongside the technical methods.
Epidemiology  ·  Modeling  ·  Public Health
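The discrete-time SIR model behind this example can be simulated in a few lines. The sketch below only runs the model forward with hypothetical parameters; it does not perform the least-squares fit, bootstrap, or MCMC steps, and the function name is an assumption.

```python
def simulate_sir(beta, gamma, population, initial_infected, days):
    """Discrete-time SIR with one-day steps; returns (S, I, R) trajectories."""
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    susceptible, infected, recovered = [s], [i], [r]
    for _ in range(days):
        new_infections = beta * s * i / population
        new_recoveries = gamma * i
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        susceptible.append(s)
        infected.append(i)
        recovered.append(r)
    return susceptible, infected, recovered

# Hypothetical parameters chosen for illustration, not fitted to any data.
beta, gamma = 0.3, 0.1
S, I, R = simulate_sir(beta, gamma, population=10_000, initial_infected=10, days=112)
r0 = beta / gamma            # basic reproduction number in this model
peak_day = I.index(max(I))   # day with the most simultaneous infections
```

Fitting then amounts to searching for the `beta` and `gamma` that make the simulated curve match the observed case counts most closely.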
08
Ecological Species Distribution Model
A conservation biologist has GPS occurrence records for an endangered amphibian species and 8 bioclimatic rasters and needs to model current habitat suitability and project how climate change will shift the species range under two emissions scenarios.
Quyi builds a MaxEnt-equivalent species distribution model using presence-background methods, evaluates model performance with AUC and the Boyce index, generates a current suitability map, projects suitability under RCP 4.5 and 8.5 scenarios from the uploaded climate data, calculates the percentage of current suitable habitat projected to be lost under each scenario, and produces a conservation-oriented report with recommendations for protected area prioritization.
Build a species distribution model for my amphibian occurrence records using 8 bioclimatic variables. Evaluate with AUC and Boyce index, generate a current suitability map, project under RCP 4.5 and 8.5, calculate projected habitat loss under each scenario, and produce a conservation management report.
Ecology  ·  Conservation  ·  Climate
09
fMRI Activation and Connectivity Analysis
A neuroscientist has pre-processed fMRI activation data from 28 subjects performing a working memory task and needs to identify significant activation clusters, report peak coordinates, and examine functional connectivity between regions of interest.
Quyi runs a second-level mixed-effects GLM on the contrast maps, applies cluster-based family-wise error correction, reports peak activation coordinates in MNI space with cluster sizes and Z-scores, extracts ROI time series for functional connectivity analysis, computes a connectivity matrix, and produces an activation table and connectivity figure formatted for neuroimaging journal submission alongside interpretive commentary citing relevant functional neuroanatomy literature.
Analyze my fMRI second-level contrast data from 28 subjects. Run mixed-effects GLM, apply cluster-FWE correction, report peak MNI coordinates and cluster sizes, extract ROI time series for functional connectivity analysis, generate a connectivity matrix, and produce an activation table formatted for NeuroImage submission.
Neuroscience  ·  Neuroimaging  ·  Connectivity
10
Genome-Wide Association Study
A plant geneticist has a wheat diversity panel of 320 accessions genotyped at 55,000 SNPs and phenotyped for drought tolerance across two growing seasons, and needs to identify genomic loci associated with tolerance while controlling for population structure.
Quyi runs a mixed linear model GWAS with kinship matrix correction to control for relatedness and population stratification, generates a Manhattan plot with genome-wide and suggestive significance thresholds, produces a quantile-quantile plot to assess model calibration, annotates the top associated SNPs to nearby candidate genes, and writes a genetics results section discussing the biological plausibility of the identified loci in the context of drought tolerance mechanisms described in the uploaded papers.
Run a GWAS on my wheat panel (320 accessions, 55,000 SNPs) for drought tolerance phenotype. Use MLM with kinship correction, generate Manhattan and QQ-plots, annotate top SNPs to candidate genes within 50kb, and write a genetics results section discussing the biological significance of top loci.
Plant Genomics  ·  GWAS  ·  Statistics
11
Structural Equation Modeling for Survey Data
A social scientist has Likert-scale survey data on institutional trust from 2,600 respondents across 5 demographic groups and needs to test a structural model of trust predictors while verifying that the measurement instrument performs consistently across groups.
Quyi runs a confirmatory factor analysis to verify the 5-factor measurement model, tests configural, metric, and scalar measurement invariance across demographic groups using chi-square difference tests, fits the full structural equation model estimating causal paths between predictors and institutional trust, reports standardized path coefficients with 95% CIs, fit indices (CFI, TLI, RMSEA, SRMR), and produces path diagrams for both the measurement and structural models in publication-ready format.
Run CFA and full SEM on my 5-factor institutional trust survey data (n=2,600). Test configural, metric, and scalar invariance across 5 demographic groups, fit the structural model estimating predictors of trust, report standardized path coefficients, fit indices, and generate path diagrams for manuscript submission.
Social Science  ·  SEM  ·  Psychometrics
12
Causal Inference with Difference-in-Differences
An economist studying the effect of a regional minimum wage increase on employment needs to estimate the causal effect using a natural experiment design, controlling for pre-existing differences between treated and comparison counties.
Quyi selects matched control counties using propensity score matching on pre-treatment covariates, tests the parallel pre-trends assumption using an event study plot with coefficients for each pre-treatment period, runs the difference-in-differences model with two-way fixed effects, reports the average treatment effect on the treated (ATT) with clustered standard errors, and tests for heterogeneous treatment effects by industry sector — producing a complete econometrics results section with robustness checks.
Run a DiD analysis on my county-level employment data around the minimum wage policy change. Match control counties via propensity score matching, test parallel trends with an event study, run TWFE DiD, report ATT with clustered SEs, test for heterogeneous effects by industry, and include robustness checks. Cite Card and Krueger and relevant modern papers.
Economics  ·  Causal Inference  ·  Policy
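The core difference-in-differences logic reduces to four group means, as the sketch below shows. This is the basic 2x2 estimator on made-up employment figures, not the matched two-way fixed effects model with clustered standard errors described above; the function name and row layout are assumptions.

```python
from statistics import mean

def did_estimate(rows):
    """2x2 difference-in-differences from (treated, post, outcome) rows."""
    def cell(treated, post):
        return mean(y for t, p, y in rows if t == treated and p == post)
    treated_change = cell(1, 1) - cell(1, 0)
    control_change = cell(0, 1) - cell(0, 0)
    return treated_change - control_change

# Made-up outcomes: (treated county?, after the policy?, employment index).
rows = [
    (1, 0, 10.0), (1, 0, 12.0), (1, 1, 15.0), (1, 1, 17.0),
    (0, 0, 8.0), (0, 0, 10.0), (0, 1, 10.0), (0, 1, 12.0),
]
effect = did_estimate(rows)
```

The control group's change stands in for what would have happened to treated counties without the policy, which is why the parallel pre-trends check matters.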
13
Bioinformatics Pathway Enrichment
A bioinformatician has a list of 847 differentially expressed genes from a cancer study and needs to identify overrepresented biological pathways and gene ontology terms to understand the mechanistic basis of the observed expression changes.
Quyi runs both over-representation analysis (ORA) using Fisher's exact test and gene set enrichment analysis (GSEA) against GO Biological Process, KEGG, and Reactome databases, applies Benjamini-Hochberg correction, generates a bubble plot of the top 20 enriched terms for each database, constructs a network visualization showing pathway crosstalk, identifies three biologically coherent themes in the enrichment results, and writes a biological interpretation section grounding the findings in known cancer biology from the uploaded literature.
Run ORA and GSEA pathway enrichment on my 847-gene DEG list against GO-BP, KEGG, and Reactome. Apply BH correction, generate bubble plots for top-20 terms per database, visualize pathway network crosstalk, identify the main biological themes, and write an interpretation section citing relevant cancer biology literature.
Bioinformatics  ·  Pathway Analysis  ·  Cancer Biology
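Over-representation analysis for a single pathway comes down to a hypergeometric tail probability, which is the one-sided Fisher's exact test. The sketch below computes it with exact integer arithmetic; the gene counts are small made-up numbers, and the function name is hypothetical.

```python
from math import comb

def ora_pvalue(hits, list_size, pathway_size, universe_size):
    """P(X >= hits) under the hypergeometric null (one-sided Fisher's exact test).

    hits          : genes in the DEG list that belong to the pathway
    list_size     : total genes in the DEG list
    pathway_size  : total genes annotated to the pathway
    universe_size : total genes in the background universe
    """
    total = comb(universe_size, list_size)
    upper = min(list_size, pathway_size)
    tail = sum(
        comb(pathway_size, k) * comb(universe_size - pathway_size, list_size - k)
        for k in range(hits, upper + 1)
    )
    return tail / total

# Toy example: 4 of 4 listed genes fall in a 5-gene pathway, universe of 20.
p = ora_pvalue(hits=4, list_size=4, pathway_size=5, universe_size=20)
```

In a full analysis this p-value is computed once per pathway and the resulting set is then corrected for multiple testing.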
14
Material Fatigue Characterization
A materials engineer has cyclic loading test data for three aerospace aluminum alloy variants and needs to generate S-N curves, estimate endurance limits, and determine which alloy meets certification standards for a specific application.
Quyi fits Basquin's power law to the S-N data for each alloy using nonlinear regression, estimates the fatigue strength coefficient and exponent with confidence intervals, calculates the 10⁷-cycle endurance limit with uncertainty bounds for each alloy, compares alloy performance using ANCOVA, flags which alloys meet the applicable aerospace fatigue standard from the uploaded specification document, and generates a combined S-N curve plot with all three alloys and their confidence bands.
Fit S-N curves to my fatigue test data for three aluminum alloy variants using Basquin's law. Report fatigue strength coefficients and exponents with CIs, calculate 10⁷-cycle endurance limits, compare alloys with ANCOVA, and flag which variants meet the fatigue requirements in my uploaded specification document.
Materials Science  ·  Aerospace  ·  Engineering
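A minimal sketch of a Basquin fit, assuming the power-law form S = A · N^b with N as cycles to failure. It uses ordinary least squares on log-transformed data, a simpler stand-in for the nonlinear regression described above, and the data points are synthetic values generated from A = 900 and b = -0.1.

```python
import math

def fit_basquin(cycles, stress):
    """Least-squares fit of log10(S) = log10(A) + b * log10(N); returns (A, b)."""
    x = [math.log10(n) for n in cycles]
    y = [math.log10(s) for s in stress]
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx
    a = 10 ** (mean_y - b * mean_x)
    return a, b

# Synthetic S-N points (stress in arbitrary units) for illustration.
cycles = [1e4, 1e5, 1e6, 1e7]
stress = [900 * n ** -0.1 for n in cycles]
a, b = fit_basquin(cycles, stress)
strength_1e7 = a * 1e7 ** b   # predicted stress amplitude at 10^7 cycles
```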
15
Environmental Pollutant Trend Analysis
An environmental chemist has 22 years of quarterly heavy metal concentration data from river sediment samples across 14 stations near an industrial area and needs to assess whether remediation efforts since 2015 have produced measurable improvements.
Quyi runs a change-point detection analysis to identify significant breaks in the concentration time series around the 2015 remediation date, applies Mann-Kendall trend tests separately for pre- and post-remediation periods, calculates geoaccumulation indices and enrichment factors to contextualize absolute concentrations, compares industrial stations to upstream reference stations using permutation tests, assesses exceedance of WHO sediment quality guidelines before and after remediation, and generates a remediation effectiveness report with temporal trend plots for each station.
Analyze my 22-year quarterly heavy metal dataset (Pb, Cd, Hg, As) from 14 stations. Run change-point detection around 2015, apply Mann-Kendall trends pre- and post-remediation, calculate geoaccumulation indices, compare industrial vs. reference stations with permutation tests, assess WHO guideline exceedance, and generate a remediation effectiveness report.
Environmental Science  ·  Remediation  ·  Chemistry
🏢 Industry · Business & Operations
16
Multi-Horizon Sales Forecasting
A retail operations team has three years of weekly SKU-level sales data across 6 regions and needs a 12-month forward forecast broken down by product category, with error bounds, to inform inventory purchasing and capacity planning decisions.
Quyi builds a hierarchical forecasting model with additive decomposition of trend, seasonality, and holiday effects, incorporates provided promotion calendars and macroeconomic indicators as external regressors, generates 12-month forecasts for each SKU-region combination with 80% and 95% prediction intervals, reports MAPE and WAPE by product category, identifies the 10 SKUs with highest forecast uncertainty, and produces a procurement planning summary ranked by projected demand growth.
Forecast the next 12 months of weekly SKU-level sales from my 3-year dataset across 6 regions. Use a hierarchical model with holiday effects, promotion dummies, and CPI as regressor. Report MAPE and WAPE by category, generate prediction intervals, identify high-uncertainty SKUs, and produce a procurement planning summary.
Retail · Forecasting · Inventory
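The two error metrics reported in this use case differ in how they weight items, which a small sketch makes concrete. The function names and the three data points are illustrative assumptions.

```python
def mape(actual, forecast):
    """Mean absolute percentage error; skips periods with zero actuals."""
    terms = [abs(a - f) / abs(a) for a, f in zip(actual, forecast) if a != 0]
    return 100.0 * sum(terms) / len(terms)

def wape(actual, forecast):
    """Weighted absolute percentage error: total absolute error over total actuals."""
    total_error = sum(abs(a - f) for a, f in zip(actual, forecast))
    return 100.0 * total_error / sum(abs(a) for a in actual)

# Illustrative values: one high-volume SKU, one low-volume SKU, one exact hit
actual = [100.0, 10.0, 50.0]
forecast = [90.0, 15.0, 50.0]
mape_pct = mape(actual, forecast)   # (10% + 50% + 0%) / 3
wape_pct = wape(actual, forecast)   # 15 / 160
```

MAPE treats every SKU equally, so a miss on a low-volume item inflates it, while WAPE weights errors by volume. Reporting both by category shows whether forecast error is concentrated in small or large sellers.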
17
Customer Churn Prediction and Revenue Impact
A SaaS company has 2 years of customer behavioral data including login frequency, feature usage, support tickets, and contract value, and needs to identify customers at risk of churning in the next 90 days and quantify the revenue at risk.
Quyi engineers behavioral features from the raw logs, trains and compares logistic regression, gradient boosting, and neural network models using stratified cross-validation, uses SHAP values to produce an explainability profile identifying the top 10 churn drivers, generates a calibrated churn probability score for every active customer, segments customers into high/medium/low risk tiers, calculates the ARR at risk in each tier, and produces a ranked intervention list with estimated revenue impact of retention actions.
Build a 90-day churn prediction model on my SaaS customer dataset. Compare logistic regression, XGBoost, and neural network. Use SHAP for top-10 churn driver analysis, generate calibrated risk scores for all active customers, segment into risk tiers, calculate ARR at risk per tier, and produce a prioritized retention intervention list.
SaaS · Churn · Revenue
18
Manufacturing Statistical Process Control
A precision manufacturing plant has 8 months of sensor data from a CNC machining line measuring 6 critical dimensions and needs to determine whether processes are in control, calculate process capability indices, and identify which parameters predict defect events.
Quyi generates Xbar-R and individual-moving-range control charts for each dimension with Nelson rule violation flags, calculates Cp, Cpk, and Ppk capability indices for each characteristic with uncertainty estimates, runs CUSUM charts to detect process drift, performs logistic regression to identify which process parameters are most predictive of out-of-tolerance events, and produces an SPC report summarizing process health, out-of-control periods, and recommended process adjustments.
Run full SPC analysis on my 8-month CNC machining sensor data for 6 critical dimensions. Generate Xbar-R and CUSUM charts with Nelson rule flags, calculate Cp, Cpk, Ppk, identify OOT-predictive parameters via logistic regression, and produce an SPC health report with recommended process adjustments.
Manufacturing · Quality · SPC
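The capability indices in this use case differ mainly in which standard deviation they use. The sketch below assumes equal-size subgroups and estimates within-subgroup sigma as R-bar/d2, using the standard d2 control chart constants for subgroup sizes 2 to 5. The function name and sample measurements are hypothetical.

```python
import statistics

# Standard d2 constants for estimating sigma from the average subgroup range
D2 = {2: 1.128, 3: 1.693, 4: 2.059, 5: 2.326}

def capability_indices(subgroups, lsl, usl):
    """Return (Cp, Cpk, Ppk) for equal-size subgroups and two-sided spec limits."""
    size = len(subgroups[0])
    values = [v for group in subgroups for v in group]
    mean = statistics.mean(values)
    r_bar = statistics.mean([max(g) - min(g) for g in subgroups])
    sigma_within = r_bar / D2[size]            # short-term variation
    sigma_overall = statistics.stdev(values)   # long-term variation
    nearest = min(usl - mean, mean - lsl)      # distance to the closer limit
    cp = (usl - lsl) / (6 * sigma_within)
    cpk = nearest / (3 * sigma_within)
    ppk = nearest / (3 * sigma_overall)
    return cp, cpk, ppk

# Illustrative measurements in mm: three subgroups of two parts each
subgroups = [[9.9, 10.1], [9.8, 10.2], [10.0, 10.0]]
cp, cpk, ppk = capability_indices(subgroups, lsl=9.4, usl=10.6)
```

Cpk never exceeds Cp, and a gap between the two signals an off-center process. A gap between Cpk and Ppk suggests the process behaves differently over the long run than within subgroups.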
19
Portfolio Value-at-Risk for Regulatory Reporting
A risk management team at an asset manager needs to calculate daily and 10-day VaR for a multi-asset portfolio for regulatory capital reporting, using multiple methodologies and backtesting results to satisfy internal model approval requirements.
Quyi calculates 1-day and 10-day VaR and CVaR at both 95% and 99% confidence levels using historical simulation, parametric Gaussian, and 10,000-path Monte Carlo methods, runs Kupiec's unconditional coverage test and Christoffersen's conditional coverage test for each model, generates a traffic-light backtesting report, calculates model diversification benefits, and produces a regulatory capital report formatted according to Basel III internal model requirements with full methodology documentation.
Calculate 1-day and 10-day VaR and CVaR at 95% and 99% using historical simulation, parametric Gaussian, and 10k-path Monte Carlo for my multi-asset portfolio. Run Kupiec and Christoffersen backtests, generate a traffic-light report, and produce a Basel III regulatory capital report with full methodology documentation.
Finance · Risk · Regulatory
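Kupiec's unconditional coverage test, one of the backtests named above, compares the observed number of VaR exceptions with the number implied by the confidence level. The sketch below is a minimal version with a hypothetical function name. It relies on the likelihood-ratio statistic being asymptotically chi-square with one degree of freedom, so the p-value can be computed with `math.erfc`.

```python
import math

def kupiec_pof(n_obs, n_exceptions, var_confidence):
    """Return (LR statistic, p-value) for Kupiec's proportion-of-failures test."""
    expected_rate = 1.0 - var_confidence
    observed_rate = n_exceptions / n_obs

    def log_likelihood(rate):
        # Binomial log-likelihood; terms with a zero count are skipped
        value = 0.0
        if n_obs - n_exceptions > 0:
            value += (n_obs - n_exceptions) * math.log(1.0 - rate)
        if n_exceptions > 0:
            value += n_exceptions * math.log(rate)
        return value

    lr = -2.0 * (log_likelihood(expected_rate) - log_likelihood(observed_rate))
    lr = max(lr, 0.0)                          # guard against rounding below zero
    p_value = math.erfc(math.sqrt(lr / 2.0))   # chi-square(1) upper tail
    return lr, p_value

# Illustrative case: 10 exceptions in 250 trading days against a 99% VaR
lr_stat, p_val = kupiec_pof(250, 10, 0.99)
```

The test only checks the exception count. Christoffersen's conditional coverage test additionally checks whether exceptions cluster in time, which is why the use case runs both.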
20
Predictive Maintenance from Sensor Data
An industrial equipment operator has 9 months of vibration and temperature sensor data from 45 pumps, including 12 recorded failure events, and needs a model that predicts failure at least 48 hours in advance to reduce emergency repair costs.
Quyi engineers both time-domain (RMS, kurtosis, crest factor) and frequency-domain (dominant frequencies, spectral entropy) features from the raw sensor signals, addresses the severe class imbalance using SMOTE and class weighting, trains and evaluates LSTM and gradient boosting classifiers using precision-recall AUC on the imbalanced test set, generates a prediction horizon analysis showing model accuracy at 24h, 48h, and 72h ahead, and produces a business case calculation comparing predicted maintenance cost savings against the current reactive repair baseline.
Build a predictive maintenance model from my 9-month vibration and temperature sensor data for 45 pumps with 12 failure events. Engineer time-domain and frequency-domain features, handle imbalance with SMOTE, train LSTM and XGBoost, evaluate at 24h/48h/72h prediction horizons with precision-recall AUC, and produce a business case comparing savings vs. reactive maintenance baseline.
Industrial IoT · Predictive Maintenance
21
30-Day Hospital Readmission Risk Model
A hospital quality improvement team has discharge records for 4,200 heart failure admissions and needs a readmission risk model that clinicians can use at the point of discharge to prioritize post-discharge follow-up calls and interventions.
Quyi preprocesses the clinical dataset, handles missing values using multiple imputation, trains XGBoost and logistic regression models with SMOTE oversampling, calibrates probabilities using Platt scaling, evaluates discrimination (AUROC), calibration (Hosmer-Lemeshow), and clinical utility (decision curve analysis), generates SHAP force plots for individual patient explanation, and produces a clinical decision support summary identifying the 5 most actionable risk factors that a care team can intervene on before discharge.
Build a 30-day readmission risk model for my heart failure discharge dataset (n=4,200). Use multiple imputation for missing data, train XGBoost and logistic regression with SMOTE, calibrate with Platt scaling, evaluate with AUROC, Hosmer-Lemeshow, and decision curve analysis, generate SHAP force plots, and produce a clinical decision support summary with the 5 most actionable pre-discharge risk factors.
Healthcare · Clinical ML · Decision Support
22
Insurance Claims Fraud Detection
An insurance analytics team has 18 months of motor claims data and needs an anomaly detection system that flags suspicious claims for investigation while keeping false positive rates low enough to avoid disrupting legitimate claimant experience.
Quyi engineers behavioral and network features including claim frequency per policyholder, garage relationships, and timing patterns relative to policy inception, trains Isolation Forest and Autoencoder anomaly detectors on the fraud-free claim history, tunes detection thresholds to achieve a false positive rate below 5%, generates an investigation priority queue ranked by anomaly score, and produces a fraud signature profile identifying the 8 behavioral patterns most discriminating between flagged and clean claims — enabling the investigation team to quickly triage the queue.
Build a fraud detection system on my 18-month motor claims dataset. Engineer behavioral and network features, train Isolation Forest and Autoencoder models, tune thresholds for <5% FPR, generate an investigation priority queue ranked by anomaly score, and produce a fraud signature report identifying the top 8 behavioral patterns of suspicious claims.
Insurance · Fraud Detection · Anomaly
23
A/B Test Analysis with Multiple Metrics
A product team ran a 4-week experiment on a checkout flow redesign with 24,000 users split evenly and needs a statistically rigorous analysis of conversion rate and 5 secondary metrics before deciding whether to ship the change to 100% of users.
Quyi runs frequentist hypothesis tests for each metric with Bonferroni-Holm correction for multiple comparisons, conducts a power analysis to confirm the test was adequately powered, tests for heterogeneous treatment effects across device type and user tenure, runs a Simpson's paradox check across key segments, calculates the practical significance of each metric difference in revenue terms, and produces an experiment readout document summarizing the decision recommendation with supporting statistical evidence and confidence levels.
Analyze my A/B test results (n=12,000 per arm) for checkout redesign. Test conversion rate and 5 secondary metrics with Bonferroni-Holm correction, run power analysis, test for heterogeneous effects by device and user tenure, check for Simpson's paradox, calculate revenue impact of each significant difference, and produce an experiment readout with a ship/no-ship recommendation.
Product · Experimentation · Statistics
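The Bonferroni-Holm correction used in this use case can be sketched as a step-down adjustment of the raw p-values. The function name and the three p-values are illustrative assumptions.

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # smallest p first
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)  # keep adjusted values monotone
        adjusted[idx] = running_max
    return adjusted

# Illustrative raw p-values for three metrics
raw = [0.01, 0.04, 0.03]
adjusted = holm_adjust(raw)
significant = [p < 0.05 for p in adjusted]
```

Holm multiplies the smallest p-value by the number of tests, the next smallest by one fewer, and so on. It controls the family-wise error rate like plain Bonferroni while rejecting at least as many hypotheses.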
24
Renewable Energy Generation Forecasting
An energy trading desk operates a portfolio of 6 wind and solar assets and needs 72-hour ahead generation forecasts to optimize dispatch scheduling and reduce imbalance penalties in the day-ahead electricity market.
Quyi builds separate forecasting models for wind (power curve regression with atmospheric stability correction) and solar assets (clear-sky model with cloud cover adjustment), trains gradient boosting models on 18 months of historical SCADA output matched to NWP weather forecasts, evaluates with normalized RMSE and skill score versus persistence baseline, generates 72-hour probabilistic forecasts with 10/50/90 percentile bands for each asset, and produces an operational dispatch report with aggregate portfolio generation scenarios for the trading team.
Build 72-hour generation forecasts for my 6-asset renewable portfolio (3 wind, 3 solar) using SCADA history and NWP weather data. Separate wind power curve and solar clear-sky models, evaluate with nRMSE vs. persistence baseline, generate P10/P50/P90 probabilistic forecasts, and produce an operational dispatch report with aggregate portfolio scenarios.
Energy · Forecasting · Trading
25
ESG Carbon Footprint and Disclosure Report
A sustainability team needs to calculate the company's Scope 1, 2, and 3 greenhouse gas emissions from activity data across 8 business units, benchmark against sector peers, and produce a disclosure report aligned with regulatory requirements and investor expectations.
Quyi calculates GHG emissions across all three scopes using GHG Protocol emission factors applied to the activity data, breaks down contributions by business unit and emission category, calculates intensity metrics (tCO₂e per unit revenue, per employee, per unit output), compares intensity against sector benchmark data from the uploaded industry report, identifies the top 5 emission reduction opportunities ranked by abatement potential and implementation feasibility, and generates a TCFD-aligned climate disclosure report with scenario analysis narrative for two temperature pathways.
Calculate Scope 1, 2, and 3 GHG emissions from my activity data across 8 business units using GHG Protocol factors. Compute intensity metrics (tCO₂e/revenue, tCO₂e/employee), benchmark against sector peers in my uploaded report, identify the top 5 reduction opportunities, and generate a TCFD-aligned disclosure report with 1.5°C and 2°C scenario narratives.
ESG · Sustainability · Regulatory
📊 Finance & Accounting · Financial Analysis & Reporting
26A
Accounts Receivable Aging & DSO Analysis
A mid-size B2B company's finance team needs to assess the health of its receivables portfolio — identifying slow-paying customers, projecting bad debt exposure under IFRS 9 expected credit loss rules, and tracking whether collections efficiency is improving or deteriorating over time.
Quyi simulates 3 years of AR transaction data for 500 customers, computes Days Sales Outstanding month-by-month, constructs aging buckets (0–30, 31–60, 61–90, 90+ days overdue), builds a customer-level risk score from payment history, and applies a simplified ECL model to estimate the provision required at each reporting date. It generates aging heatmaps by customer segment, a DSO trend line with 12-month moving average, and a ranked watchlist of high-risk accounts sorted by exposure and risk score.
Simulate 3 years of AR data for a mid-size B2B company (500 customers). Calculate DSO monthly, build aging buckets (0–30, 31–60, 61–90, 90+ days), score customers by payment risk, apply an IFRS 9 ECL model to project bad debt provision, and generate aging heatmaps, DSO trend lines, and a ranked high-risk account watchlist.
Accounting · IFRS 9 · Credit Risk
27A
Budget vs Actual Variance Analysis
A CFO needs a monthly management reporting pack that explains exactly why actual P&L results deviated from budget across 8 cost centers — splitting each variance into its volume, price, and mix drivers — and flags the exceptions that require executive attention.
Quyi simulates 12 months of budget vs actual data across 8 cost centers and a full P&L structure, decomposes revenue variances into volume, price, and mix effects and cost variances into spending and efficiency components, applies a 10% materiality threshold to automatically flag significant exceptions, and builds a ranked exception list sorted by absolute variance value. Output includes management-style waterfall charts for both revenue and cost, a heat map of variance intensity by cost center and month, and a formatted executive commentary section.
Simulate a full P&L with 8 cost centers and 12 months of budget vs actual data. Decompose revenue variances into volume, price, and mix; cost variances into spending and efficiency. Flag exceptions exceeding 10%. Produce waterfall charts, a variance heatmap by cost center and month, and a ranked exception list with executive commentary.
Accounting · Management Reporting · Variance Analysis
28A
Break-Even & Contribution Margin Analysis
A management accountant at a multi-product manufacturer needs to know which product lines are actually covering their costs, what volume is needed to break even on each, and how sensitive total profit is to demand shifts — before presenting to the board.
Quyi builds a full contribution margin model for 5 product lines, computing variable cost per unit, contribution margin per unit and as a percentage of revenue, fixed cost absorption per product, and product-level break-even volumes. It then calculates the blended weighted-average contribution margin, the overall company break-even, and the margin of safety. A sensitivity analysis varies volume ±20% in 5% steps and maps the effect on total operating profit for each product. Output includes a CVP chart per product and a sensitivity tornado chart.
Model a multi-product manufacturer (5 product lines) with fixed and variable cost structures. Calculate contribution margins (unit and %), break-even volumes and revenue per product, blended CM, margin of safety, and operating leverage. Simulate ±20% volume sensitivity and plot CVP charts for each product and a sensitivity tornado chart.
Accounting · CVP Analysis · Management Accounting
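The core cost-volume-profit arithmetic in this use case is short enough to show directly. The sketch below covers a single product with hypothetical function names and made-up figures. A multi-product model would repeat it per line and blend the margins by sales mix.

```python
def break_even_units(fixed_costs, price, variable_cost):
    """Units needed for contribution margin to cover fixed costs."""
    unit_margin = price - variable_cost
    if unit_margin <= 0:
        raise ValueError("price must exceed variable cost per unit")
    return fixed_costs / unit_margin

def margin_of_safety(actual_units, break_even):
    """Share of current volume that can be lost before a loss is made."""
    return (actual_units - break_even) / actual_units

# Illustrative product: $50 price, $30 variable cost, $120,000 fixed costs
be_units = break_even_units(120_000, 50.0, 30.0)
cm_ratio = (50.0 - 30.0) / 50.0
safety = margin_of_safety(8_000, be_units)
```

The same functions can be evaluated across a volume grid (for example, ±20% in 5% steps) to produce the sensitivity table described above.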
29A
13-Week Cash Flow Forecast & Liquidity Monitoring
A treasury analyst at a manufacturing company needs to maintain a rolling 13-week cash flow forecast, compare weekly actuals against the forecast to understand what is driving deviations, and monitor whether cash balances are staying above the minimum $500K covenant threshold.
Quyi builds a 13-week direct cash flow model segmented into operating receipts, supplier payments, payroll, capex, debt service, and financing flows, computes weekly net cash movement and running balance, generates weekly actual-vs-forecast variance decomposed by category, flags any week where projected balance falls below $500K, and calculates a cash conversion efficiency metric. Visualization includes an inflow/outflow waterfall per week, a cash runway chart with the covenant floor overlaid, and a category-level variance summary table.
Build a 13-week rolling cash flow model for a manufacturer. Include operating, investing, and financing activities. Compare actuals vs. forecast weekly, decompose variances by category, flag weeks where balance drops below $500K covenant minimum, and visualize weekly inflow/outflow waterfalls and a cash runway chart with the covenant floor.
Treasury · Cash Management · Forecasting
30A
Inventory Valuation: FIFO vs Weighted Average vs LIFO
A retail company operating in an inflationary environment wants to understand how its choice of inventory cost-flow assumption will affect reported gross margin, balance sheet inventory values, and tax liability across 24 months — before deciding which method to adopt.
Quyi simulates 24 months of purchase and sales transactions for 10 SKUs under rising input costs, applies FIFO, weighted-average, and LIFO cost-flow assumptions to each SKU independently, and computes COGS, ending inventory value, and gross margin for each month and method. It then calculates the cumulative tax difference assuming a 25% rate and runs a sensitivity on inflation rate from 2% to 8%. Output includes margin divergence line charts by method, a comparative P&L summary table, and a tax impact bar chart across inflation scenarios.
Simulate 24 months of inventory transactions for a retail company with 10 SKUs under inflationary conditions. Apply FIFO, weighted-average, and LIFO assumptions. Compare COGS, gross margin, and ending inventory by method monthly. Compute cumulative tax impact differences at a 25% rate and plot margin divergence, P&L comparison tables, and an inflation sensitivity chart.
Accounting · Inventory · Tax
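To show why the three cost-flow assumptions diverge under rising costs, the sketch below computes cost of goods sold for one SKU. It assumes a periodic system (all purchases for the period are known before COGS is calculated), which is a simplification. The function names and purchase layers are hypothetical.

```python
def cogs_by_layers(purchases, units_sold, newest_first):
    """COGS consuming purchase layers oldest-first (FIFO) or newest-first (LIFO)."""
    layers = list(reversed(purchases)) if newest_first else list(purchases)
    remaining, cogs = units_sold, 0.0
    for quantity, unit_cost in layers:
        taken = min(quantity, remaining)
        cogs += taken * unit_cost
        remaining -= taken
        if remaining == 0:
            break
    if remaining > 0:
        raise ValueError("units sold exceed units purchased")
    return cogs

def cogs_weighted_average(purchases, units_sold):
    """COGS at the average unit cost of all purchased units."""
    total_units = sum(q for q, _ in purchases)
    total_cost = sum(q * c for q, c in purchases)
    return units_sold * total_cost / total_units

# Illustrative layers: (units, unit cost) with costs rising over time
purchases = [(100, 10.0), (100, 12.0), (100, 14.0)]
fifo = cogs_by_layers(purchases, 150, newest_first=False)
lifo = cogs_by_layers(purchases, 150, newest_first=True)
wavg = cogs_weighted_average(purchases, 150)
```

With rising costs FIFO gives the lowest COGS and LIFO the highest, so reported gross margin and taxable income move in the opposite direction.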
31A
Fixed Asset Depreciation & IAS 36 Impairment Testing
An asset-intensive company needs to produce its annual depreciation schedules for 100 assets across mixed useful lives, compare the P&L impact of three depreciation methods, and identify which assets require an impairment write-down based on recoverable amount testing.
Quyi builds a full asset register with 100 assets across 6 asset classes, computes annual and cumulative depreciation under straight-line, double-declining balance, and units-of-production methods, generates the full depreciation schedule table for each method, and applies an IAS 36 impairment test to 20 assets flagged as underperforming — comparing value-in-use (discounted cash flow) and fair value less costs of disposal against carrying amount. Output includes a comparative depreciation expense chart by method and year, a bar chart of impairment charges by asset class, and a formatted board-ready impairment note.
Model a capital-intensive company with 100 assets across 6 asset classes. Compare straight-line, double-declining balance, and units-of-production depreciation. Run an IAS 36 impairment test (VIU via DCF vs. carrying amount) for 20 underperforming assets. Output a full depreciation schedule, comparative expense chart by method and year, and a formatted impairment note.
Accounting · IFRS · Impairment
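Two of the depreciation methods compared in this use case can be sketched as annual schedules. The double-declining version below simply caps each charge so book value never falls below salvage; it does not switch to straight-line late in the asset's life, as some policies do. The function names and asset figures are hypothetical.

```python
def straight_line(cost, salvage, life_years):
    """Equal annual charge over the useful life."""
    annual = (cost - salvage) / life_years
    return [annual] * life_years

def double_declining(cost, salvage, life_years):
    """Charge twice the straight-line rate on opening book value each year."""
    rate = 2.0 / life_years
    book_value, schedule = cost, []
    for _ in range(life_years):
        charge = min(book_value * rate, book_value - salvage)  # floor at salvage
        schedule.append(charge)
        book_value -= charge
    return schedule

# Illustrative asset: cost 10,000, salvage 1,000, 5-year life
sl_schedule = straight_line(10_000.0, 1_000.0, 5)
ddb_schedule = double_declining(10_000.0, 1_000.0, 5)
```

For this asset both methods expense the same total over five years, but double-declining front-loads the charge, which is the P&L timing difference the comparison chart is meant to show.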
32A
Revenue Recognition Under ASC 606 — SaaS Contracts
A SaaS company's accounting team is implementing ASC 606 for 200 multi-element contracts bundling software licenses, annual support, and implementation services. They need to allocate transaction prices to each performance obligation and model the resulting revenue recognition pattern.
Quyi applies the five-step ASC 606 model to each contract: identifies distinct performance obligations (license, support, implementation), allocates the transaction price using relative standalone selling prices, determines whether each obligation is satisfied at a point-in-time or over time, and computes monthly recognized revenue, contract assets, and deferred revenue for each contract over its life. It aggregates to a company-level monthly waterfall and produces a contract liability roll-forward schedule. Output includes monthly revenue waterfall charts, a deferred revenue balance trend, and an annotated sample contract walk-through.
Model a SaaS company with 200 bundled contracts (license + support + implementation). Apply ASC 606: identify performance obligations, allocate transaction price by SSP, determine recognition timing (point-in-time vs. over time), and compute monthly recognized revenue, contract assets, and deferred revenue. Produce monthly revenue waterfall charts, a contract liability roll-forward, and an annotated sample contract walk-through.
Accounting · ASC 606 · Revenue Recognition
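The price allocation step in this use case is a proportional split by standalone selling price (SSP). The sketch below uses a hypothetical function name and made-up contract values, and assumes support is recognized evenly over a 12-month term.

```python
def allocate_by_ssp(transaction_price, standalone_prices):
    """Split a bundle price across obligations in proportion to their SSPs."""
    total_ssp = sum(standalone_prices.values())
    return {
        name: transaction_price * ssp / total_ssp
        for name, ssp in standalone_prices.items()
    }

# Illustrative contract: 100,000 bundle price against 120,000 of combined SSP
ssp = {"license": 80_000.0, "support": 30_000.0, "implementation": 10_000.0}
allocation = allocate_by_ssp(100_000.0, ssp)

# Support assumed to be satisfied over time, evenly across 12 months
monthly_support_revenue = allocation["support"] / 12
```

The bundle discount is spread across all obligations in proportion to SSP, so each obligation carries the same percentage discount.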
33A
Lease Accounting Under IFRS 16
A retail chain with 50 property leases of varying terms and rates is transitioning to IFRS 16 and needs to calculate right-of-use assets and lease liabilities at inception, produce amortization schedules for each lease, and compare the P&L impact against the previous straight-line operating lease treatment.
Quyi processes all 50 lease contracts, computes present values of future lease payments at the incremental borrowing rate for each, records initial ROU assets and lease liabilities, separates each payment into interest and principal components, and generates the full amortization schedule over each lease term. It then compares P&L under IFRS 16 (depreciation + interest) vs the old IAS 17 straight-line rental charge, quantifying the front-loading effect and EBITDA uplift. Output includes liability unwinding curves, ROU asset depreciation schedules, a cumulative P&L comparison chart, and a transition-date balance sheet impact table.
Model a retail company with 50 lease contracts (varying terms, rates, and payment schedules). Calculate ROU assets and lease liabilities at inception, produce full amortization schedules splitting interest vs. principal, and compare P&L under IFRS 16 vs. prior operating lease treatment. Plot liability unwinding curves, ROU depreciation, cumulative P&L comparison, and transition-date balance sheet impact.
Accounting · IFRS 16 · Leases
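The liability measurement and amortization split in this use case can be sketched for a single lease. The example assumes level payments made at the end of each period, no initial direct costs or incentives (so the right-of-use asset equals the opening liability), and straight-line depreciation. The function name and lease terms are hypothetical.

```python
def lease_schedule(payment, rate, n_periods):
    """Return (opening liability, rows) where each row is
    (period, opening, interest, principal, closing)."""
    liability = sum(payment / (1 + rate) ** t for t in range(1, n_periods + 1))
    rows, opening = [], liability
    for period in range(1, n_periods + 1):
        interest = opening * rate          # unwinding of the discount
        principal = payment - interest     # remainder reduces the liability
        closing = opening - principal
        rows.append((period, opening, interest, principal, closing))
        opening = closing
    return liability, rows

# Illustrative lease: 100,000 per year for 5 years at a 5% borrowing rate
liability, rows = lease_schedule(100_000.0, 0.05, 5)
year1_expense = rows[0][2] + liability / 5   # interest plus ROU depreciation
```

Year-one expense exceeds the 100,000 straight-line rental charge because interest is highest while the liability is largest, which is the front-loading effect the comparison quantifies.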
34A
Working Capital Cycle & Cash Conversion Analysis
A distribution company's CFO wants to understand whether the company is managing its working capital efficiently, where the cash conversion cycle is losing time compared to industry norms, and what the free cash flow impact would be from targeted improvements.
Quyi simulates 36 months of balance sheet data, computes DIO (Days Inventory Outstanding), DSO, and DPO month by month and derives the cash conversion cycle as DIO + DSO − DPO. It identifies seasonal peaks in working capital demand, benchmarks the CCC against the industry median, and models the incremental free cash flow released by a 5-day DSO improvement and a 3-day DPO extension. Output includes a CCC decomposition line chart, a working capital efficiency heatmap by month and component, a seasonal peak chart, and a free cash flow sensitivity table.
Simulate 36 months of balance sheet data for a distribution company. Calculate DIO, DSO, DPO, and CCC monthly. Identify seasonal working capital peaks, benchmark CCC against industry median, model free cash flow impact of a 5-day DSO reduction and 3-day DPO extension, and plot a CCC decomposition chart and working capital efficiency heatmap.
Treasury · Working Capital · Cash Flow
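The cash conversion cycle formula above (DIO + DSO − DPO) and the cash effect of a DSO improvement can be sketched as follows. The function names and balances are illustrative, and the day counts use a 365-day year with period-end balances, which is one common convention among several.

```python
def days_outstanding(balance, annual_flow, days_in_year=365):
    """Convert a balance into days of the related annual flow."""
    return balance / annual_flow * days_in_year

def cash_conversion_cycle(inventory, receivables, payables, revenue, cogs):
    dio = days_outstanding(inventory, cogs)
    dso = days_outstanding(receivables, revenue)
    dpo = days_outstanding(payables, cogs)
    return dio, dso, dpo, dio + dso - dpo

# Illustrative annual figures
dio, dso, dpo, ccc = cash_conversion_cycle(
    inventory=600.0, receivables=500.0, payables=400.0,
    revenue=7_300.0, cogs=3_650.0,
)

# Cash released by collecting 5 days faster: 5 days of average daily revenue
cash_from_dso_cut = 5 * 7_300.0 / 365
```

A DPO extension works the same way but is scaled by daily COGS rather than daily revenue.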
35A
Payroll Cost Allocation & Headcount Analytics
An HR finance team needs to understand total compensation cost by department over 24 months, track headcount movements including hires, terminations, and transfers, and quantify the cost impact of turnover to support workforce planning decisions.
Quyi simulates payroll data for 200 employees across 6 departments over 24 months, allocating costs into base salary, overtime, benefits, employer taxes, and bonus components. It tracks monthly headcount movements (new hires, voluntary and involuntary terminations, internal transfers), computes cost-per-head by department and month, and estimates turnover cost as a multiple of departing employee salary for recruitment and ramp-up. Output includes stacked payroll cost bars by department, a headcount waterfall per month, cost-per-head trend lines, and a turnover cost impact summary table.
Simulate 2 years of monthly payroll for 200 employees across 6 departments. Allocate costs by salary, overtime, benefits, and taxes. Track headcount movements (hires, terminations, transfers) monthly. Compute cost-per-head by department, estimate turnover cost impact, and plot stacked payroll bars, a headcount waterfall, cost-per-head trends, and a turnover cost summary.
Finance · HR Analytics · Cost Allocation
36A
Tax Provision & Effective Tax Rate Reconciliation
A multinational group with subsidiaries in four jurisdictions needs to compute its consolidated income tax provision, reconcile the effective tax rate to the statutory rate, identify the main temporary differences driving deferred tax balances, and model the ETR sensitivity to a rate change in each country.
Quyi models a multinational with operations in the US (21%), UK (25%), Germany (29.9%), and Singapore (17%), computing current and deferred tax for each entity. It identifies and quantifies four categories of temporary differences — accelerated depreciation, provisions, stock-based compensation, and loss carryforwards — builds the group ETR reconciliation waterfall from the blended statutory rate to the effective rate, and models the ETR impact of a ±5% rate change in each jurisdiction. Output includes an ETR waterfall chart, a deferred tax movement schedule, and a jurisdiction rate sensitivity heatmap.
Model a multinational with operations in US, UK, Germany, and Singapore. Compute current and deferred tax for each entity. Identify temporary differences (accelerated depreciation, provisions, stock comp, loss carryforwards). Build the group ETR reconciliation, model ETR impact of a ±5% rate change per country, and plot an ETR waterfall, deferred tax movement schedule, and jurisdiction sensitivity heatmap.
Tax · Accounting · Multinational
37A
Financial Ratio Analysis & Industry Benchmarking
An investment analyst needs to compare 6 companies in the same sector across 5 years of financial statements, score each on liquidity, solvency, profitability, and efficiency, and produce a benchmarking pack that clearly ranks them and highlights areas of relative strength and weakness.
Quyi simulates 5 years of financial statements for all 6 companies, computes 16 ratios across four categories, including liquidity (current ratio, quick ratio), solvency (D/E, interest coverage, debt/EBITDA), profitability (ROE, ROA, EBITDA margin, gross margin), and efficiency (asset turnover, receivables turnover, inventory days) measures, and builds a composite ranking score. For each company it generates a radar chart showing all ratio dimensions simultaneously, and produces a cross-company benchmarking heatmap with quartile coloring, plus a 5-year trend chart for the three most discriminating ratios.
Simulate 5 years of financial statements for 6 companies in the same industry. Calculate liquidity, solvency, profitability, and efficiency ratios. Rank companies by composite score, generate a radar chart per company, produce a benchmarking heatmap with quartile coloring, and plot 5-year trends for the three most discriminating ratios.
Finance · Benchmarking · Ratio Analysis
38A
COGS & Gross Margin Bridge Analysis
A consumer goods company's finance director needs to explain to the board why gross margin moved year over year across 8 product lines — separating the effects of volume growth, price changes, product mix shifts, and input cost movements so that each business unit can be held accountable for its contribution.
Quyi simulates 3 years of monthly manufacturing and sales data for 8 product lines, decomposes COGS into raw materials, direct labor, and overhead, and builds a gross margin bridge from year to year showing four driver categories: volume effect, price/rate effect, product mix effect, and cost inflation effect. It identifies the two product lines acting as margin diluters, computes contribution margin ranking by SKU, and produces a management-ready output with a bridge waterfall chart for each year-over-year transition and a margin-by-product heatmap.
Simulate 3 years of monthly manufacturing data for a consumer goods company with 8 product lines. Decompose COGS into raw materials, labor, and overhead. Build a gross margin bridge (volume, price, mix, cost effects) for each year-over-year transition. Identify margin diluters, rank SKUs by contribution margin, and produce waterfall bridge charts and a margin-by-product heatmap.
Accounting · Gross Margin · Management Reporting
39A
Accounts Payable Optimization & Supplier Risk
A procurement finance team wants to optimize its AP strategy across 80 suppliers: identifying where early payment discounts are being forgone, spotting suppliers that are increasingly slow to invoice (a possible fraud signal), and calculating the liquidity cost-benefit of extending payment terms selectively.
Quyi simulates 24 months of AP transactions across 80 suppliers, computes DPO trends by supplier and category, identifies 2/10 net 30 discount opportunities and calculates the implied annualized cost of not taking them (comparing against the current cost of capital), flags anomalies including duplicate payment patterns, round-number invoices, and weekend submission dates, and models the net working capital impact of a 5-day DPO extension across the top-20 suppliers by spend. Output includes a supplier concentration chart, DPO trend by category, a discount opportunity prioritization table, and an anomaly flag report.
Simulate 24 months of AP transactions across 80 suppliers. Calculate DPO trends by supplier and category. Identify early payment discount opportunities (2/10 net 30) and compute the annualized cost of forgoing them vs. cost of capital. Flag duplicate payments, round-number invoices, and weekend submissions. Model the working capital impact of 5-day DPO extension and produce a supplier concentration chart and anomaly flag report.
Accounting · AP · Fraud Detection
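The implied cost of skipping a 2/10 net 30 discount is worth working through, because it is usually far above a typical cost of capital. The sketch below uses a hypothetical function name and computes both the simple annualized rate and the compounded effective rate.

```python
def cost_of_forgoing_discount(discount_pct, discount_days, net_days, days_in_year=365):
    """Return (simple annualized rate, effective annual rate) of paying at net terms."""
    d = discount_pct / 100.0
    period_rate = d / (1.0 - d)                       # cost per extra credit period
    periods = days_in_year / (net_days - discount_days)
    simple = period_rate * periods
    effective = (1.0 + period_rate) ** periods - 1.0  # with compounding
    return simple, effective

# 2/10 net 30: pay 2% more to keep the cash for 20 extra days
simple_rate, effective_rate = cost_of_forgoing_discount(2.0, 10, 30)
```

Paying on day 30 instead of day 10 costs roughly 37% a year on a simple basis, so taking the discount is usually worthwhile unless the company's cost of capital is higher than that.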
40A
Financial Statement Fraud Detection via Benford's Law
An internal audit team wants to screen a company's general ledger for statistical anomalies that may indicate manipulation — using data analytics to surface the entries most worthy of manual review before the year-end audit.
Quyi generates a simulated general ledger of 10,000 journal entries across 15 accounts, applies Benford's Law to first-digit and second-digit distributions, computes chi-square test statistics and Z-scores for each digit position, and flags additional anomaly patterns: round-number clustering, entries posted outside business hours or on public holidays, unusual debit/credit account combinations, and entries by users with elevated access. Each entry receives a composite anomaly risk score. Output includes digit distribution charts vs. expected Benford curves with significance bands, a ranked exception list sorted by risk score, and a management summary suitable for presenting to the audit committee.
Apply Benford's Law to a simulated general ledger of 10,000 journal entries. Compute first- and second-digit frequency distributions, chi-square test statistics, and Z-scores. Flag round-number clustering, after-hours postings, holiday entries, and unusual account combinations. Assign composite anomaly risk scores, plot digit distributions vs. Benford curves, and produce a ranked exception list and audit committee summary.
Audit · Fraud Detection · Forensic Accounting
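The first-digit part of this screen can be sketched in a few lines. This is an illustration only, assuming non-zero amounts and a z-score without continuity correction; the helper names are hypothetical, and the source does not specify how Quyi implements the test.

```python
import math
from collections import Counter

# Benford's expected first-digit probabilities: P(d) = log10(1 + 1/d)
BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_digit(amount) -> int:
    """Leading non-zero digit of a non-zero amount (sign is ignored)."""
    return int(str(abs(amount)).lstrip("0.")[0])


def benford_first_digit_test(amounts):
    """Return (chi_square, z_scores) for the first-digit distribution.

    chi_square has 8 degrees of freedom; z_scores maps digit -> z statistic.
    """
    digits = [first_digit(a) for a in amounts if a != 0]
    n = len(digits)
    counts = Counter(digits)
    chi_square = 0.0
    z_scores = {}
    for d, p in BENFORD.items():
        observed = counts.get(d, 0)
        expected = n * p
        chi_square += (observed - expected) ** 2 / expected
        z_scores[d] = (observed / n - p) / math.sqrt(p * (1 - p) / n)
    return chi_square, z_scores
```

Digits with large absolute z-scores point to the amounts worth pulling into the exception list; the second-digit test follows the same pattern with a different expected distribution.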
🔬 More Academia Advanced Research
26
Single-Cell RNA-seq Clustering & Trajectory
A stem cell biologist has scRNA-seq data from 8,400 cells across two differentiation timepoints and needs to identify cell subtypes, annotate clusters by marker genes, and reconstruct the differentiation trajectory from progenitor to mature cell states.
Quyi normalizes and log-transforms the count matrix, reduces dimensionality with PCA followed by UMAP, clusters cells using the Leiden algorithm at multiple resolutions, identifies marker genes for each cluster via Wilcoxon rank-sum tests with Bonferroni correction, annotates clusters against a cell-type marker database from uploaded literature, constructs a pseudotime trajectory using diffusion maps connecting progenitor to mature states, identifies genes whose expression changes significantly along the trajectory, and generates a complete figure panel of UMAP embeddings with cluster labels, dot plots of marker expression, and pseudotime gradient plots, formatted for a Cell or Nature Methods submission.
Analyze my scRNA-seq count matrix (8,400 cells, 2 timepoints). Normalize, reduce with PCA+UMAP, cluster with Leiden algorithm, identify marker genes per cluster with Bonferroni correction, annotate cell types from my uploaded marker reference, reconstruct pseudotime trajectory with diffusion maps, identify trajectory-associated genes, and generate a publication figure panel.
Single-Cell · Genomics · Trajectory
27
Protein Mutation Effect on Stability
A structural biologist has thermodynamic stability measurements (ΔΔG) for 220 point mutations in a therapeutic antibody and needs to identify which physicochemical features best predict destabilizing mutations, to guide rational engineering of the next variant.
Quyi encodes each mutation with 14 physicochemical features (polarity change, hydrophobicity shift, charge delta, residue volume, evolutionary conservation score from MSA, B-factor, solvent accessibility, secondary structure context), trains and compares ridge regression, random forest, and gradient boosting models with nested cross-validation, reports feature importances and partial dependence plots for the top predictors, identifies the mechanistic rules that separate stabilizing from destabilizing mutations, and generates an engineering guidance table ranking candidate mutations by predicted ΔΔG with confidence intervals — ready to guide the next round of experimental synthesis.
Predict ΔΔG from 14 physicochemical features for my 220 antibody point mutations. Train ridge regression, random forest, and XGBoost with nested cross-validation, report feature importances and partial dependence plots, identify mechanistic rules for stabilizing vs. destabilizing mutations, and generate a ranked mutation guidance table with predicted ΔΔG and CI.
Structural Biology · Protein Engineering
28
Bayesian Hierarchical Multi-Site Clinical Study
A statistician is analyzing a 6-site clinical study where site-to-site variability in outcomes is expected, treatment effects may differ between sites, and the sample size per site (15–45 patients) is too small for reliable site-specific frequentist estimates.
Quyi builds a Bayesian hierarchical model with site-level random effects for both intercept and treatment slope, fits the model using MCMC with 4 chains and convergence diagnostics (R-hat, effective sample size), estimates the overall pooled treatment effect with 95% credible interval, estimates site-specific effects with partial pooling, generates a caterpillar plot of site effects, performs posterior predictive checks, and writes a statistical analysis plan-style results section explaining how the hierarchical model improves on both fixed-effects (ignores site variation) and fully unpooled (insufficient power) alternatives.
Fit a Bayesian hierarchical model to my 6-site clinical study data. Model site-level random intercepts and slopes, run MCMC with 4 chains, check convergence with R-hat and ESS, report pooled and site-specific treatment effects with 95% CrI, generate a caterpillar plot, run posterior predictive checks, and write a results section comparing the hierarchical approach to fixed-effects and unpooled alternatives.
Bayesian · Clinical Statistics · Multi-Site
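Partial pooling, the core idea behind the hierarchical model in this use case, can be illustrated with the closed-form normal-normal case. This is a simplified sketch, not the MCMC fit described above: it assumes the pooled mean and between-site standard deviation are already known, and the function name is hypothetical.

```python
import math


def partial_pooling(site_estimate: float, site_se: float,
                    pooled_mean: float, between_site_sd: float):
    """Posterior mean and SD of one site's effect under a normal-normal model.

    Assumes pooled_mean and between_site_sd are known; a full hierarchical
    model would estimate them jointly with the site effects.
    """
    data_precision = 1 / site_se ** 2
    prior_precision = 1 / between_site_sd ** 2
    weight = data_precision / (data_precision + prior_precision)
    post_mean = weight * site_estimate + (1 - weight) * pooled_mean
    post_sd = math.sqrt(1 / (data_precision + prior_precision))
    return post_mean, post_sd
```

A site with few patients has a large standard error, so its estimate is pulled strongly toward the pooled mean; a well-powered site keeps most of its own estimate. That behavior sits between the fully pooled and fully unpooled alternatives.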
29
Phylogenetic Tree Reconstruction
A virologist has 48 full-genome sequences of an emerging RNA virus collected across 6 countries over 18 months and needs to reconstruct the evolutionary history, estimate divergence dates, and infer transmission routes between countries.
Quyi aligns the sequences, selects an appropriate nucleotide substitution model using model selection criteria, reconstructs a maximum likelihood phylogeny with bootstrap support values, fits a Bayesian molecular clock model to estimate divergence dates with 95% highest posterior density intervals, annotates the tree by country of origin to visualize geographic clustering, uses parsimony to infer the simplest set of international transmission events consistent with the tree, and generates a publication-quality annotated phylogenetic tree with timeline axis and a transmission network diagram, with a virology methods section citing foundational phylodynamics literature.
Reconstruct the phylogeny of my 48 RNA virus genomes from 6 countries. Align sequences, select the best evolutionary model, build ML tree with bootstrap, fit Bayesian molecular clock for divergence dating, annotate by country, infer transmission routes, generate an annotated phylogenetic tree with timeline, and write a viral phylodynamics methods section.
Virology · Phylogenetics · Epidemiology
30
Seismic Signal Analysis and Event Classification
A seismologist has one year of continuous waveform data from a regional network of 12 stations and needs to detect microseismic events, calculate their magnitude and location, and classify events as tectonic, induced, or noise — to assess whether nearby industrial activity is triggering seismicity.
Quyi applies a STA/LTA detector to each station's waveform to identify candidate events, uses cross-correlation across station pairs to estimate arrival time differences and triangulate hypocentral locations, calculates local magnitude from peak-to-peak amplitudes calibrated to the regional attenuation model, trains a random forest classifier on time-frequency features (spectral centroid, kurtosis, P/S amplitude ratio) to classify events as tectonic, induced, or noise, generates a space-time seismicity map, and produces a seismic hazard assessment report comparing induced event rates before and after the industrial activity began.
Detect and classify microseismic events in my 12-station waveform year. Apply STA/LTA detection, triangulate hypocenters via cross-correlation, calculate local magnitudes, classify events (tectonic/induced/noise) with random forest on spectral features, generate space-time seismicity maps, and produce a hazard assessment comparing event rates before and after the industrial activity start date.
Seismology · Geophysics · Hazard
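The STA/LTA detection step can be sketched as a ratio of short-term to long-term average signal energy. This is a minimal pure-Python illustration using trailing windows; window lengths and the trigger threshold are assumptions to be tuned per network, and the function names are hypothetical.

```python
def sta_lta(signal, n_sta: int, n_lta: int):
    """Classic STA/LTA ratio on squared amplitude, using trailing windows.

    Requires n_sta <= n_lta. Samples before the first full LTA window get 0.0.
    """
    prefix = [0.0]
    for x in signal:
        prefix.append(prefix[-1] + x * x)  # running sum of energy
    ratios = [0.0] * len(signal)
    for i in range(n_lta - 1, len(signal)):
        sta = (prefix[i + 1] - prefix[i + 1 - n_sta]) / n_sta
        lta = (prefix[i + 1] - prefix[i + 1 - n_lta]) / n_lta
        ratios[i] = sta / lta if lta > 0 else 0.0
    return ratios


def detect_onsets(ratios, threshold: float):
    """Indices where the ratio first rises to or above the threshold."""
    onsets = []
    above = False
    for i, r in enumerate(ratios):
        if r >= threshold and not above:
            onsets.append(i)
        above = r >= threshold
    return onsets


# Synthetic trace: quiet background, a short burst, then quiet again.
trace = [0.1, -0.1] * 100 + [5.0, -5.0] * 10 + [0.1, -0.1] * 100
print(detect_onsets(sta_lta(trace, n_sta=5, n_lta=50), threshold=4.0))
```

Each detected onset is only a candidate event; location, magnitude, and classification happen in the later steps of the pipeline.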
31
Quantum Chemistry Potential Energy Surface
A computational chemist has DFT-calculated energies for 340 molecular geometries along a reaction coordinate and needs to fit a smooth potential energy surface, identify transition states and energy barriers, and compute thermodynamic quantities at reaction conditions.
Quyi fits a high-dimensional polynomial or spline potential energy surface to the 340 geometry-energy pairs, locates stationary points by finding zero gradients analytically, identifies transition state geometries and calculates forward and reverse activation barriers with zero-point energy correction, uses the fitted surface to compute classical and quantum mechanical rate constants via transition state theory and the Wigner tunneling correction, generates a 2D PES contour map with reaction coordinate overlaid, and produces a thermochemistry table of ΔH, ΔG, ΔS, and Keq at 298 K and at user-specified reaction temperatures — formatted as supplementary material for a JACS submission.
Fit a potential energy surface to my 340 DFT geometry-energy points. Locate stationary points, calculate forward/reverse activation barriers with ZPE correction, compute classical and quantum rate constants via TST with Wigner correction, generate a 2D PES contour map, and produce a thermochemistry table (ΔH, ΔG, ΔS, Keq) at 298 K and my reaction temperature.
Computational Chemistry · DFT · Reaction Kinetics
32
Ocean Acidification Carbonate System Analysis
A marine chemist has 15 years of monthly seawater measurements (pH, total alkalinity, DIC) from an open-ocean mooring and needs to characterize seasonal cycles, long-term acidification trends, and calculate the full carbonate system including aragonite saturation state relevant to coral bleaching risk.
Quyi calculates the complete carbonate system (pCO₂, [CO₃²⁻], [HCO₃⁻], aragonite saturation Ω_Ar, calcite saturation Ω_Ca) from the measured pH and alkalinity using thermodynamic equilibrium constants, applies STL seasonal decomposition to isolate biological and physical seasonal drivers, fits a linear trend to the residual to quantify acidification rate in pH units per decade, identifies years where Ω_Ar fell below the bleaching threshold, correlates with SST anomalies, and generates an ocean acidification trend report with seasonal cycle plots, long-term trend visualization, and saturation state time series — with a marine chemistry methods section.
Analyze 15 years of monthly seawater pH and alkalinity from my mooring. Calculate the full carbonate system (pCO₂, aragonite and calcite saturation, HCO₃⁻, CO₃²⁻), apply STL decomposition, quantify the acidification trend in pH/decade, identify bleaching-risk saturation events, correlate with SST anomalies, and generate a marine chemistry trend report.
Marine Chemistry · Ocean Science · Climate
33
Stellar Spectra Classification and Abundance
An astrophysicist has optical spectra for 280 stars from a survey telescope and needs to classify them by spectral type, estimate effective temperature, surface gravity, and metallicity from the spectra, and identify chemical peculiarities consistent with binary mass transfer.
Quyi applies principal component analysis to the continuum-normalized spectra to extract the dominant spectral variance components, trains a random forest classifier on equivalent widths of key diagnostic lines (Ca H&K, Hα, Mg b, Na D) to assign MK spectral types, fits synthetic spectral templates to estimate Teff, log g, and [Fe/H] for each star, flags stars with abundance anomalies relative to the solar neighborhood metallicity distribution, generates a color-magnitude diagram and Hertzsprung-Russell diagram for the sample, and identifies candidate chemically peculiar stars for follow-up observation — producing an astronomical catalog-format results table.
Classify 280 stellar spectra from my survey. Apply PCA, train a random forest classifier on line equivalent widths, estimate Teff/log g/[Fe/H] via template fitting, flag abundance anomalies, generate HR and color-magnitude diagrams, and produce an astronomical catalog table of spectral types, atmospheric parameters, and chemical peculiarity flags.
Astrophysics · Spectroscopy · Stellar
34
Archaeological Radiocarbon Chronology
An archaeologist has 32 radiocarbon dates from stratified contexts across a Bronze Age site and needs to build a Bayesian chronological model that incorporates stratigraphic ordering constraints to produce calibrated date estimates for occupation phases and cultural transitions.
Quyi calibrates each raw ¹⁴C date against the IntCal calibration curve to produce probability distributions, builds a Bayesian sequence model with stratigraphic constraints encoded as phase boundaries, estimates the start and end dates of each occupation phase as posterior distributions using MCMC, tests for outliers using the Agreement Index, generates a probability distribution plot for each date and a site chronology overview figure, and writes an archaeological chronology report stating the calibrated date ranges in standard format (cal BP and cal BCE) with 68% and 95% credible intervals — citing the relevant calibration curve literature.
Build a Bayesian radiocarbon chronology for my 32 ¹⁴C dates from a stratified Bronze Age site. Calibrate against IntCal, encode stratigraphic ordering as sequence model constraints, estimate phase start/end dates with 68% and 95% CrI, test for outliers with Agreement Index, generate a chronology overview figure, and write an archaeological chronology results section.
Archaeology · Bayesian · Geochronology
35
Social Network Community Detection and Influence
A sociologist studying scientific collaboration has a co-authorship network of 1,200 researchers and 6,500 collaboration edges and needs to identify research communities, measure individual influence, detect structural holes, and analyze how network position predicts citation impact.
Quyi constructs the co-authorship graph, detects communities using Louvain modularity optimization, calculates degree, betweenness, eigenvector, and PageRank centrality for every node, identifies structural hole brokers using Burt's effective network size and constraint metrics, runs a regression analysis predicting citation h-index from network position features while controlling for career age and field, generates a force-directed network visualization colored by community with node size proportional to PageRank, and produces a sociological network analysis report with implications for research policy regarding collaboration support and interdisciplinary bridge-building.
Analyze my 1,200-researcher co-authorship network (6,500 edges). Detect communities via Louvain, calculate degree/betweenness/eigenvector/PageRank centrality, identify structural hole brokers with Burt's constraint, regress h-index on network position features controlling for career age, generate a colored force-directed network visualization, and write a sociological network analysis report.
Network Science · Sociology · Bibliometrics
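PageRank, one of the centrality measures in this use case, can be sketched with plain power iteration. The example assumes an undirected co-authorship graph stored as an adjacency dictionary in which every neighbor also appears as a key; the damping factor of 0.85 is the conventional default, and the names are hypothetical.

```python
def pagerank(adjacency: dict, damping: float = 0.85,
             tol: float = 1e-10, max_iter: int = 200) -> dict:
    """PageRank by power iteration; ranks sum to 1."""
    nodes = list(adjacency)
    n = len(nodes)
    rank = {v: 1 / n for v in nodes}
    for _ in range(max_iter):
        # Rank held by nodes without neighbors is spread evenly over all nodes.
        dangling = sum(rank[v] for v in nodes if not adjacency[v])
        new = {v: (1 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            for w in adjacency[v]:
                new[w] += damping * rank[v] / len(adjacency[v])
        delta = sum(abs(new[v] - rank[v]) for v in nodes)
        rank = new
        if delta < tol:
            break
    return rank


# Toy network: researcher A has co-authored with B, C, and D.
coauthors = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}
for name, score in sorted(pagerank(coauthors).items()):
    print(name, round(score, 3))
```

On a real 1,200-node network a graph library would normally be used instead; the sketch only shows what the score measures.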
36
Drug Combination Synergy Analysis
A pharmacologist has dose-response data from a 6×6 drug combination matrix experiment for two anticancer compounds tested in three cell lines and needs to quantify synergy, distinguish it from additivity and antagonism, and identify the concentration ranges where synergy is strongest.
Quyi fits four-parameter logistic Hill curves to each single-agent dose-response, constructs Bliss independence, Loewe additivity, and HSA null models for the combination, calculates synergy scores and volume under the synergy surface for each model across all three cell lines, generates interaction landscape heatmaps and 3D synergy surface plots for each null model, applies the ZIP (zero interaction potency) model to identify the dose regions of maximum synergy, and produces a pharmacology report comparing synergy results across cell lines and discussing mechanistic hypotheses consistent with the synergy patterns.
Analyze drug combination synergy in my 6×6 dose-response matrix for two anticancer compounds across 3 cell lines. Fit Hill curves, calculate Bliss/Loewe/HSA and ZIP synergy scores, generate interaction landscape heatmaps and 3D synergy surfaces, identify maximum synergy dose regions, and write a pharmacology report comparing synergy across cell lines with mechanistic hypotheses.
Pharmacology · Drug Synergy · Oncology
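Two building blocks of this analysis, the four-parameter Hill curve and the Bliss independence null model, can be sketched as follows. Effects are assumed to be fractional inhibition values between 0 and 1, and the function names are hypothetical.

```python
def hill(dose: float, bottom: float, top: float, ec50: float, slope: float) -> float:
    """Four-parameter logistic (Hill) dose-response curve."""
    if dose <= 0:
        return bottom
    return bottom + (top - bottom) / (1 + (ec50 / dose) ** slope)


def bliss_expected(effect_a: float, effect_b: float) -> float:
    """Expected combined effect if the two drugs act independently."""
    return effect_a + effect_b - effect_a * effect_b


def bliss_excess(observed, single_a, single_b):
    """Observed minus Bliss-expected effect for every dose pair.

    observed[i][j] is the measured effect at dose i of drug A and dose j of
    drug B; single_a and single_b hold the single-agent effects.
    Positive values suggest synergy, negative values antagonism.
    """
    return [
        [observed[i][j] - bliss_expected(single_a[i], single_b[j])
         for j in range(len(single_b))]
        for i in range(len(single_a))
    ]
```

The excess matrix is what an interaction landscape heatmap displays; Loewe, HSA, and ZIP differ only in how the expected surface is defined.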
37
Gait Biomechanics Asymmetry Analysis
A biomechanics researcher has 3D motion capture and force plate data from 22 participants with anterior cruciate ligament reconstruction and 22 matched controls walking at self-selected speed, and needs to quantify gait asymmetry and identify the biomechanical compensations that distinguish recovered from non-recovered patients.
Quyi extracts 28 gait parameters per limb per participant (stance time, peak knee flexion, hip extension moment, vertical GRF loading rate, ankle push-off impulse, etc.), calculates symmetry indices comparing operated to non-operated limbs in the ACL group and preferred to non-preferred in controls, runs MANOVA to test group differences, applies stepwise discriminant analysis to identify the minimum parameter set that distinguishes ACL-R from control gait, generates mean ± SD waveform plots for the key kinetic and kinematic variables with shaded confidence bands, and produces a clinical biomechanics report recommending targeted rehabilitation focuses based on the discriminating parameters.
Analyze 3D gait biomechanics for 22 ACL-R patients and 22 controls. Extract 28 gait parameters, calculate symmetry indices, run MANOVA for group differences, apply discriminant analysis to find the minimum parameter set distinguishing groups, generate kinematic and kinetic waveform plots with CI bands, and produce a clinical rehabilitation report.
Biomechanics · Rehabilitation · Sports Medicine
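The source does not state which symmetry index Quyi computes. One widely used definition expresses the between-limb difference as a percentage of the between-limb mean, sketched below with hypothetical function and parameter names.

```python
def symmetry_index(operated: float, non_operated: float) -> float:
    """Percentage symmetry index: 0 means perfect symmetry.

    Negative values mean the operated limb shows the smaller value.
    """
    limb_mean = 0.5 * (operated + non_operated)
    return 100.0 * (operated - non_operated) / limb_mean


def symmetry_profile(operated: dict, non_operated: dict) -> dict:
    """Symmetry index for every gait parameter present in both limbs."""
    return {
        name: symmetry_index(operated[name], non_operated[name])
        for name in operated
        if name in non_operated
    }
```

For controls, the same function applies with the preferred and non-preferred limbs in place of operated and non-operated.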
38
Corpus Linguistic Change Over Time
A computational linguist has a digitized corpus of 4.2 million words from English newspapers published between 1850 and 1950 and wants to track semantic change in a set of 30 politically charged words, test whether change accelerates during wartime periods, and visualize distributional drift.
Quyi trains decade-level word2vec embeddings on the corpus, aligns embedding spaces across decades using orthogonal Procrustes transformations to make word positions comparable across time, calculates cosine similarity drift for each target word relative to its 1850 position, fits a changepoint model to detect when each word's usage shifted most rapidly, compares drift rates in war decades vs. peacetime with a permutation test, generates 2D UMAP projections of each word's semantic neighborhood across time, and produces a linguistics paper-formatted analysis with distributional shift figures and discussion of sociohistorical factors driving the observed semantic changes.
Track semantic change for 30 politically charged words across my 1850–1950 newspaper corpus. Train decade word2vec embeddings, align spaces via Procrustes, calculate cosine drift over time, run changepoint detection, compare war vs. peacetime drift rates with permutation tests, generate UMAP neighborhood visualizations per decade, and write a computational linguistics analysis.
Computational Linguistics · NLP · History
39
Soil Microbiome Alpha/Beta Diversity
A microbial ecologist has 16S rRNA amplicon sequencing data from 96 soil samples collected across an elevation gradient in two seasons and needs to characterize microbial diversity patterns, test whether elevation and season explain community composition, and identify which taxa drive community differentiation.
Quyi rarefies the OTU table to equal sequencing depth, calculates alpha diversity metrics (Shannon, Chao1, Faith's PD) and tests for elevation and season effects with linear mixed models, calculates Bray-Curtis and UniFrac beta diversity matrices, runs PERMANOVA to test multivariate community composition differences, performs dbRDA to partition variance explained by elevation vs. season vs. their interaction, generates NMDS ordination plots colored by elevation and season, identifies indicator taxa for each elevation band using indicator value analysis, and produces a microbial ecology results section formatted for the ISME Journal.
Analyze microbiome diversity in my 96 soil samples across an elevation gradient in 2 seasons. Rarefy OTU table, calculate Shannon/Chao1/Faith's PD alpha diversity with LMM tests, compute Bray-Curtis and UniFrac beta diversity, run PERMANOVA and dbRDA for elevation/season variance partitioning, generate NMDS plots, identify indicator taxa, and write an ISME Journal-style results section.
Microbial Ecology · Microbiome · Diversity
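Two of the diversity metrics named above have short closed forms. The sketch below computes Shannon alpha diversity for one sample and Bray-Curtis dissimilarity between two samples from raw OTU counts; it assumes the counts are already rarefied and that both samples list taxa in the same order. Function names are hypothetical.

```python
import math


def shannon(counts) -> float:
    """Shannon diversity H' = -sum(p * ln p) over taxa with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def bray_curtis(sample_a, sample_b) -> float:
    """Bray-Curtis dissimilarity: 0 for identical samples, 1 for no shared taxa."""
    numerator = sum(abs(a - b) for a, b in zip(sample_a, sample_b))
    denominator = sum(a + b for a, b in zip(sample_a, sample_b))
    return numerator / denominator
```

Computing `bray_curtis` for every pair of the 96 samples gives the distance matrix that PERMANOVA, dbRDA, and NMDS operate on.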
40
High Energy Physics Event Selection
A particle physicist has simulated signal and background events from a particle detector and needs to design an optimal event selection cut flow, train a multivariate discriminant to separate signal from background, and estimate the expected significance of a new physics signal.
Quyi constructs kinematic discriminating variables from the four-momentum data, builds a BDT (Boosted Decision Tree) classifier trained on signal vs. background Monte Carlo, optimizes cut thresholds by maximizing S/√B in a signal region, applies the selection to a validation region to check for overtraining, performs systematic uncertainty estimation by varying detector response parameters within their uncertainties, calculates the expected statistical significance using the Asimov dataset approximation for the signal hypothesis, and produces a physics analysis note with cut flow tables, ROC curves, BDT score distributions, and a significance estimate with systematic error breakdown — formatted to internal HEP collaboration standards.
Design an event selection for my signal/background MC datasets. Train a BDT discriminant, optimize S/√B cut, validate in a control region, estimate systematic uncertainties by varying detector parameters, calculate expected significance with the Asimov approximation, and produce an HEP analysis note with cut flow tables, ROC curves, and significance estimate with systematics.
Particle Physics · HEP · Machine Learning
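The two figures of merit in this use case can be written down directly. The sketch below gives the Asimov median discovery significance for a counting experiment without systematic uncertainties, the simpler S/√B estimate, and a threshold scan on classifier scores. Unweighted event counts are an assumption, and the function names are hypothetical.

```python
import math


def simple_significance(s: float, b: float) -> float:
    """S / sqrt(B), valid when S is small compared with B."""
    return s / math.sqrt(b)


def asimov_significance(s: float, b: float) -> float:
    """Asimov approximation: Z = sqrt(2 * ((s + b) * ln(1 + s/b) - s))."""
    return math.sqrt(2 * ((s + b) * math.log(1 + s / b) - s))


def best_cut(signal_scores, background_scores, thresholds):
    """Threshold maximizing S/sqrt(B) for events with score >= threshold.

    Thresholds that leave no background events are skipped.
    """
    best = None
    for t in thresholds:
        s = sum(1 for x in signal_scores if x >= t)
        b = sum(1 for x in background_scores if x >= t)
        if b == 0:
            continue
        z = simple_significance(s, b)
        if best is None or z > best[1]:
            best = (t, z)
    return best
```

In a real analysis the counts would be weighted to the expected luminosity and the systematic uncertainties folded into the significance.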
41
EEG Motor Imagery Brain-Computer Interface
A BCI researcher has 64-channel EEG data from 15 subjects performing left-hand vs. right-hand motor imagery and needs to classify the two mental states, identify which spatial patterns discriminate them, and assess cross-subject generalization for a device that must work without per-subject calibration.
Quyi applies Common Spatial Pattern (CSP) filtering to extract discriminative spatial components, calculates log-band-power features in the mu (8–12 Hz) and beta (13–30 Hz) bands for each CSP component, trains LDA and SVM classifiers with 10-fold cross-validation per subject, evaluates leave-one-subject-out generalization accuracy, identifies the CSP components with highest discriminative power and plots their scalp topographies, runs a permutation significance test to confirm classification is above chance, and produces a BCI research report comparing within-subject and cross-subject accuracies with topographic maps of the neural sources contributing to the mental state distinction.
Classify left vs. right hand motor imagery in 64-channel EEG from 15 subjects. Apply CSP spatial filtering, extract mu and beta band power features, train LDA and SVM with 10-fold CV, evaluate leave-one-subject-out generalization, identify discriminative CSP components with scalp topography plots, run permutation significance tests, and write a BCI research report.
Neuroscience · BCI · EEG
42
Nutritional Epidemiology Cohort Diet-Disease
A nutritional epidemiologist has a 12-year longitudinal cohort of 3,800 adults with dietary recall data, physical measurements, and cardiovascular event records, and needs to test the association between dietary patterns and incident CVD while properly handling confounding and competing risks.
Quyi derives dietary pattern scores using principal component analysis of 28 food groups, adjusts energy intake using the residual method, fits Cox proportional hazards models and Fine-Gray subdistribution hazard models to account for competing risks (non-CVD death), adjusts for age, sex, BMI, smoking, physical activity, SES, and medication use, tests for non-linear exposure-response relationships using restricted cubic splines, performs multiple imputation for missing covariates, tests for effect modification by sex and diabetes status, and generates a nutrition epidemiology results section with hazard ratio tables, spline plots of exposure-response curves, and a forest plot of subgroup analyses, formatted for the American Journal of Clinical Nutrition.
Analyze diet-CVD association in my 12-year cohort (n=3,800). Derive dietary pattern scores via PCA, adjust energy with residuals, fit Fine-Gray competing risk Cox models adjusting for 7 confounders, test non-linearity with restricted cubic splines, run multiple imputation, test effect modification by sex and diabetes, and generate an AJCN-formatted results section with HR tables, spline plots, and subgroup forest plot.
Epidemiology · Nutrition · Cardiology
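The residual method for energy adjustment regresses each nutrient on total energy intake and keeps the residual, re-centered at the intake expected for a person with mean energy. The sketch below uses ordinary least squares on untransformed values, which is an assumption; the function name is hypothetical.

```python
def energy_adjust(nutrient, energy):
    """Energy-adjusted nutrient intakes by the residual method.

    Each adjusted value is the regression residual plus the nutrient intake
    predicted at the cohort's mean energy intake.
    """
    n = len(nutrient)
    mean_e = sum(energy) / n
    mean_x = sum(nutrient) / n
    sxx = sum((e - mean_e) ** 2 for e in energy)
    sxy = sum((e - mean_e) * (x - mean_x) for e, x in zip(energy, nutrient))
    slope = sxy / sxx
    intercept = mean_x - slope * mean_e
    expected_at_mean = intercept + slope * mean_e
    return [
        x - (intercept + slope * e) + expected_at_mean
        for x, e in zip(nutrient, energy)
    ]
```

The adjusted values are uncorrelated with total energy intake by construction, which is what allows diet composition to be separated from the amount eaten.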
43
Polymer Rheology Master Curve Construction
A materials scientist has small-amplitude oscillatory shear data for a polymer melt measured at 8 temperatures from 140°C to 210°C and needs to construct a master curve at a reference temperature, extract relaxation spectrum, and determine viscoelastic parameters for process modeling.
Quyi applies time-temperature superposition using the WLF equation to calculate horizontal shift factors a_T and vertical shift factors for each temperature, validates TTS applicability by checking superposition quality with a Cole-Cole plot, constructs the master storage and loss modulus curves G′(ω) and G″(ω) covering 10 decades of frequency at the reference temperature, fits the generalized Maxwell model to extract the discrete relaxation spectrum, calculates the zero-shear viscosity, plateau modulus, and characteristic relaxation time, and generates a rheology characterization report with master curve plots, shift factor Arrhenius plot, and relaxation spectrum bar chart, with parameters tabulated for input into finite element flow simulation software.
Construct a rheological master curve from my oscillatory shear data at 8 temperatures (140–210°C). Apply WLF time-temperature superposition, validate with Cole-Cole plot, build G′/G″ master curves over 10 frequency decades, fit generalized Maxwell model for relaxation spectrum, extract zero-shear viscosity and plateau modulus, and generate a rheology report with parameters ready for FE simulation.
Polymer Science · Rheology · Materials
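The horizontal shift in time-temperature superposition follows the WLF equation, log₁₀ a_T = −C₁(T − T_ref) / (C₂ + T − T_ref). The sketch below applies it to a frequency sweep. C₁ and C₂ are material- and reference-dependent constants that must be fitted to the data; the values in the usage lines are placeholders, not fitted results, and the function names are hypothetical.

```python
def wlf_shift_factor(temp_c: float, ref_temp_c: float, c1: float, c2: float) -> float:
    """Horizontal shift factor a_T from the WLF equation.

    Undefined where c2 + (temp_c - ref_temp_c) equals zero.
    """
    delta = temp_c - ref_temp_c
    return 10 ** (-c1 * delta / (c2 + delta))


def shift_to_reference(frequencies, temp_c: float, ref_temp_c: float,
                       c1: float, c2: float):
    """Reduced frequencies a_T * omega for one isothermal sweep."""
    a_t = wlf_shift_factor(temp_c, ref_temp_c, c1, c2)
    return [w * a_t for w in frequencies]


# Placeholder constants for illustration only.
C1, C2 = 8.86, 101.6
for temp in (140, 170, 210):
    print(temp, wlf_shift_factor(temp, ref_temp_c=170, c1=C1, c2=C2))
```

Sweeps measured above the reference temperature shift to lower reduced frequency and sweeps below it shift higher, which is how eight narrow sweeps combine into a master curve spanning many decades.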
44
Paleoclimate Proxy Reconstruction
A paleoclimatologist has δ¹⁸O isotope records from a speleothem covering the last 12,000 years at sub-annual resolution and needs to reconstruct past temperature and precipitation, identify abrupt climate events, and compare the record to other regional proxies and model simulations.
Quyi converts the δ¹⁸O record to temperature and precipitation proxies using published cave-specific calibration equations from uploaded literature, applies age-depth modeling with Monte Carlo uncertainty propagation to produce a probabilistic age model, runs wavelet analysis to identify multi-decadal and centennial variability, detects abrupt transitions using RAMPFIT change-point detection, correlates the record with North Atlantic SST reconstructions and Greenland ice core records from uploaded datasets, and produces a paleoclimate results section with the proxy time series, wavelet scalogram, event timing comparison table, and correlation analysis with other archives — formatted for Quaternary Science Reviews.
Analyze my 12,000-year speleothem δ¹⁸O record. Apply calibration equations from my uploaded literature, build probabilistic age model with Monte Carlo propagation, run wavelet analysis, detect abrupt events with RAMPFIT, correlate with North Atlantic SST and Greenland ice core records in my uploaded datasets, and write a Quaternary Science Reviews-style results section with proxy time series, wavelet scalogram, and event comparison table.
Paleoclimatology · Geochemistry · Quaternary
45
Cognitive Load ERP Component Analysis
A cognitive neuroscientist has ERP data from 30 participants in a 3-back working memory task versus fixation baseline, and needs to quantify the N-back effect on P300 and N200 components, test whether component latency and amplitude predict behavioral accuracy, and explore hemispheric asymmetry.
Quyi segments and baseline-corrects the epochs, applies artifact rejection using peak-to-peak amplitude thresholding and independent component analysis for eye blink removal, averages ERPs by condition and electrode for each participant, identifies P300 and N200 peak latency and amplitude within predefined time windows using centroid and peak detection, runs paired t-tests comparing component measures between conditions with Bonferroni correction, correlates component amplitudes with individual behavioral accuracy scores, tests hemispheric asymmetry with laterality indices, and generates grand-average ERP waveform plots for key electrodes, topographic maps at component peaks, and a cognitive neuroscience results section.
Analyze ERPs from 30 participants in a 3-back vs. baseline task. Apply artifact rejection with ICA, extract P300 and N200 peak latency and amplitude, compare conditions with paired t-tests and Bonferroni correction, correlate component amplitudes with behavioral accuracy, test hemispheric asymmetry, and produce grand-average ERP waveforms, topographic maps, and a cognitive neuroscience results section.
Cognitive Neuroscience · EEG/ERP · Working Memory
46
Protein-Protein Interaction Network Drug Targets
A systems biologist studying Alzheimer's disease has a disease-specific PPI network of 480 proteins and 2,100 interactions and needs to identify hub proteins, find druggable bottleneck nodes, predict synthetic lethal pairs, and map the network onto known disease pathways.
Quyi computes degree, betweenness, and closeness centrality and identifies hubs above 3 standard deviations from the mean degree, calculates network robustness under targeted hub removal versus random failure, overlays the network with druggability scores from uploaded ChEMBL data to identify bottleneck proteins that are both topologically critical and chemically tractable, runs a clustering-based synthetic lethality prediction to identify module pairs whose simultaneous removal would disrupt connected components, maps proteins to KEGG and Reactome pathways, and generates a systems biology report with a network visualization, druggability-centrality scatter plot, and prioritized drug target list with rationale — formatted for Nature Systems Biology.
Analyze my Alzheimer's PPI network (480 proteins, 2,100 interactions). Identify hubs (>3 SD degree), test robustness to targeted vs. random removal, overlay with ChEMBL druggability scores to find tractable bottleneck targets, predict synthetic lethal module pairs, map to KEGG/Reactome, and generate a systems biology report with network visualization, druggability-centrality plot, and prioritized target list.
Systems Biology · Network Medicine · Drug Discovery
47
Crop Yield Response to Climate Variables
An agronomist has 30 years of wheat yield records from 140 farm locations across a region along with gridded climate data and needs to quantify how maximum temperature, rainfall distribution, and vapor pressure deficit during critical growth stages affect yields, and project changes under future climate.
Quyi engineers growing-season climate indices (GDD, heat stress days above 32°C during anthesis, drought stress index by Penman-Monteith, VPD mean and peak), runs panel regression with location fixed effects and year random effects to estimate yield responses to each climate driver, tests for non-linear threshold effects using piecewise linear regression, quantifies the relative importance of each climate driver using dominance analysis, applies the estimated response functions to downscaled CMIP6 climate projections to produce county-level yield change estimates under SSP2-4.5 and SSP5-8.5, and produces an agricultural impact assessment with yield response curves, regional change maps, and a climate adaptation report recommending crop varieties and planting date adjustments.
Model wheat yield response to climate for my 30-year, 140-location panel. Engineer growing-season indices (GDD, heat stress days, drought index, VPD), run panel regression with location FE and year RE, test non-linear thresholds, quantify relative importance via dominance analysis, project yield changes under SSP2-4.5 and SSP5-8.5, and generate a climate adaptation report with yield response curves and regional change maps.
Agriculture · Climate Impact · Food Security
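Two of the growing-season indices are simple accumulations over daily temperatures. The sketch below uses the basic averaging method for growing degree days; the base temperature is crop-specific and the default of 0 °C here is an assumption, while the 32 °C heat stress threshold comes from the use case. Function names are hypothetical.

```python
def growing_degree_days(tmax, tmin, t_base: float = 0.0) -> float:
    """Sum of daily mean temperature above t_base (degrees C), floored at zero."""
    return sum(max(0.0, (hi + lo) / 2 - t_base) for hi, lo in zip(tmax, tmin))


def heat_stress_days(tmax, threshold: float = 32.0) -> int:
    """Number of days with maximum temperature above the threshold."""
    return sum(1 for hi in tmax if hi > threshold)
```

Restricting the input lists to the anthesis window for each location and year gives the stage-specific index used as a regressor in the panel model.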
48
Turbulent Flow Simulation Validation
A fluid dynamics researcher has experimental PIV (Particle Image Velocimetry) velocity field measurements for turbulent channel flow and needs to validate a DNS simulation by comparing statistical turbulence quantities and establishing the uncertainty budget of the experimental measurements.
Quyi processes the PIV vector fields by detecting and replacing spurious vectors with bilinear interpolation, calculates mean velocity profiles, Reynolds stress profiles (⟨u′u′⟩, ⟨v′v′⟩, ⟨u′v′⟩), turbulent kinetic energy, and dissipation rate from the velocity gradients, compares each profile against the DNS data with point-wise relative error, performs Richardson extrapolation to estimate spatial resolution uncertainty, decomposes the measurement uncertainty into random and systematic components, generates a turbulence statistics comparison figure panel in wall units (y⁺), and produces a CFD validation report with a quantitative agreement table and recommendations for future experimental parameter adjustments.
Validate my DNS against PIV measurements for turbulent channel flow. Process PIV fields (spurious vector removal), calculate mean velocity and Reynolds stress profiles, compare all turbulence statistics point-wise with relative error, estimate spatial resolution uncertainty via Richardson extrapolation, decompose random vs. systematic uncertainty, generate wall-unit profile comparison figures, and write a CFD validation report.
Fluid Dynamics · CFD · Experimental
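As a rough illustration of the Reynolds stress calculation, the sketch below computes ⟨u′u′⟩, ⟨v′v′⟩, and ⟨u′v′⟩ at a single measurement point from a series of velocity samples. The function name and the sample values are hypothetical, and a real PIV workflow would apply this at every wall-normal position after spurious-vector replacement.

```python
from statistics import fmean


def reynolds_stresses(u, v):
    """Return (<u'u'>, <v'v'>, <u'v'>) from velocity samples at one point."""
    u_mean, v_mean = fmean(u), fmean(v)
    # Fluctuations are deviations from the time-averaged velocity.
    u_fluct = [x - u_mean for x in u]
    v_fluct = [x - v_mean for x in v]
    uu = fmean([a * a for a in u_fluct])
    vv = fmean([b * b for b in v_fluct])
    uv = fmean([a * b for a, b in zip(u_fluct, v_fluct)])
    return uu, vv, uv


# Hypothetical samples (m/s) at one wall-normal location.
u = [1.0, 2.0, 3.0, 2.0]
v = [0.5, -0.5, -0.5, 0.5]
print(reynolds_stresses(u, v))  # (0.5, 0.25, -0.25)
```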
49
Functional MRI Resting-State Network Analysis
A psychiatry researcher has resting-state fMRI data from 45 patients with major depressive disorder and 45 matched healthy controls and needs to characterize differences in default mode network (DMN) functional connectivity and test whether connectivity strength predicts depression severity scores.
Quyi extracts ROI time series for 10 canonical resting-state networks using an atlas-based parcellation, calculates seed-based and ICA-based functional connectivity matrices, applies Fisher's z-transformation and runs two-sample t-tests comparing each network edge between groups with FDR correction, identifies disrupted connections exceeding a Cohen's d effect size threshold, runs linear regression predicting HDRS depression scores from DMN connectivity strength while controlling for age and medication status, generates a connectome circle plot of significant group differences and a scatter plot of the connectivity-symptom relationship, and produces a psychiatry fMRI results section formatted for JAMA Psychiatry.
Compare resting-state fMRI connectivity between 45 MDD patients and 45 controls. Extract ROI time series via atlas parcellation, compute functional connectivity matrices, compare groups with t-tests and FDR correction, identify edges exceeding Cohen's d threshold, regress HDRS scores on DMN connectivity controlling for age and medication, generate a connectome circle plot and connectivity-symptom scatter, and write a JAMA Psychiatry-style results section.
Psychiatry · fMRI · Connectivity
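Two of the steps named above, Fisher's z-transformation and FDR correction, are compact enough to sketch directly. The code below assumes the Benjamini-Hochberg procedure as the FDR method (the description does not say which one is used), and the function names and p-values are illustrative.

```python
import math


def fisher_z(r):
    """Fisher z-transform of a correlation coefficient (|r| < 1)."""
    return math.atanh(r)


def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans marking which tests are rejected under BH."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose sorted p-value satisfies p_(k) <= (k/m) * alpha.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k_max = rank
    rejected = [False] * m
    # Every test ranked at or below k_max is rejected.
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            rejected[idx] = True
    return rejected


# Hypothetical edge-wise p-values from the group comparison.
p_edges = [0.039, 0.001, 0.20, 0.008, 0.041]
print(benjamini_hochberg(p_edges))  # [False, True, False, True, False]
```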
50
Instrumental Variable Analysis for Policy Evaluation
An economist evaluating the causal effect of education on lifetime earnings faces endogeneity from ability bias, and needs an instrumental variable strategy using distance to nearest college as an instrument, with careful validity testing and heterogeneous treatment effect analysis.
Quyi runs the first-stage regression of education on the instrument (distance to college) with controls, tests instrument relevance with the Cragg-Donald F-statistic and Stock-Yogo weak instrument critical values, checks the exclusion restriction by testing the instrument against alternative outcomes it should not affect, estimates the LATE (Local Average Treatment Effect) using 2SLS with heteroskedasticity-robust standard errors, performs sensitivity analysis using a Conley-Hansen-Rossi confidence set that relaxes the exclusion restriction to varying degrees, tests for heterogeneous returns to education by birth cohort, and produces an econometrics paper results section with first-stage and IV coefficient tables, instrument validity discussion, and a sensitivity analysis figure showing how results change as the exclusion restriction is relaxed.
Run an IV analysis of education on lifetime earnings using distance to college as instrument. Test relevance with Cragg-Donald F and Stock-Yogo critical values, check exclusion restriction against placebo outcomes, estimate LATE via 2SLS with robust SEs, perform Conley-Hansen-Rossi sensitivity analysis relaxing the exclusion restriction, test for cohort heterogeneity, and produce an econometrics results section with instrument validity discussion and sensitivity figure.
Economics · Causal Inference · Econometrics
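To show why the instrument matters, here is a minimal sketch of the just-identified case: with one instrument and no controls, the 2SLS estimate reduces to cov(z, y) / cov(z, x). The toy data are constructed (not real) so that the true effect is 2 and an unobserved confounder biases OLS upward; all names are hypothetical.

```python
from statistics import fmean


def cov(a, b):
    """Population covariance of two equal-length sequences."""
    a_mean, b_mean = fmean(a), fmean(b)
    return fmean([(x - a_mean) * (y - b_mean) for x, y in zip(a, b)])


def iv_estimate(z, x, y):
    """Single-instrument IV (Wald) estimator: cov(z, y) / cov(z, x)."""
    return cov(z, y) / cov(z, x)


def ols_slope(x, y):
    """Simple OLS slope of y on x, for comparison."""
    return cov(x, y) / cov(x, x)


# Toy data: u is an unobserved confounder, uncorrelated with z by construction.
z = [0.0, 0.0, 1.0, 1.0]                             # instrument
u = [-1.0, 1.0, -1.0, 1.0]                           # confounder ("ability")
x = [zi + ui for zi, ui in zip(z, u)]                # treatment
y = [2.0 * xi + 3.0 * ui for xi, ui in zip(x, u)]    # true effect of x is 2

print(iv_estimate(z, x, y))  # 2.0
print(ols_slope(x, y))       # biased upward by the confounder
```

A full analysis would add controls, robust standard errors, and the weak-instrument diagnostics described above; this sketch only isolates the identification idea.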
⚙️ More Industry Advanced Applications
51
Supply Chain Disruption Risk Scoring
A procurement director has a supplier network with 280 tier-1 and tier-2 vendors across 34 countries and needs a risk scorecard that quantifies each supplier's disruption risk from geopolitical, financial, operational, and climate dimensions, and identifies critical single points of failure in the network.
Quyi builds a multi-factor supplier risk scorecard combining country-level geopolitical risk indices (from uploaded World Bank data), supplier financial health metrics (Z-score from financial statements), lead time variance and on-time delivery history, climate exposure scores from geographic coordinates, and product criticality weights, normalizes and aggregates into a composite risk score, maps supplier network dependencies to identify single-source items without qualified alternatives, runs network centrality analysis to find suppliers whose failure would cascade to the most product lines, generates a risk matrix visualization (likelihood vs. impact) and a procurement heat map by geography, and produces an executive procurement risk report with prioritized mitigation recommendations for the top 20 highest-risk suppliers.
Score disruption risk for my 280-supplier network across 34 countries. Combine geopolitical risk (World Bank data), financial health (Z-scores), delivery performance, and climate exposure into a composite scorecard, map network dependencies to find single points of failure, run centrality analysis for cascade risk, generate a risk matrix and geographic heat map, and produce an executive procurement report with top-20 mitigation priorities.
Supply Chain · Risk · Procurement
52
Dynamic Hotel Room Pricing Optimization
A revenue management team at a hotel chain has 3 years of booking data, competitor rate feeds, local event calendars, and weather forecasts and needs a dynamic pricing model that maximizes RevPAR while maintaining occupancy targets across 5 property types.
Quyi estimates a price elasticity model by room type and booking lead time via instrumental variable regression (with competitor rates as instruments), incorporates event, weather, and day-of-week effects via gradient boosting, formulates a constrained revenue optimization problem using the estimated demand function to produce optimal prices for each room type at each lead time bucket, back-tests the pricing strategy against 6 months of hold-out data and reports incremental RevPAR lift versus the current pricing policy, generates an optimal price surface visualization by lead time and day-of-week, and produces a revenue strategy report showing the P10/P50/P90 revenue distribution under the new policy across each property type.
Build a dynamic pricing model for 5 hotel property types using 3 years of booking data, competitor rates, event calendars, and weather forecasts. Estimate price elasticity by room type and lead time using IV regression, formulate RevPAR optimization, back-test against 6-month hold-out, generate optimal price surfaces, and produce a revenue strategy report with P10/P50/P90 distributions.
Hospitality · Revenue Management · Optimization
53
Mortgage Default Scorecard
A retail bank needs to replace an 8-year-old credit scorecard for mortgage origination decisions with a new model that meets IFRS 9 staging requirements, produces a compliant probability of default, and can be explained to regulators and auditors.
Quyi performs weight-of-evidence (WOE) binning and information value (IV) calculation for all candidate variables, selects variables by information value threshold and correlation screening, builds a logistic regression scorecard with fine and coarse classing, validates discrimination (Gini, KS, AUROC) and calibration (Hosmer-Lemeshow, reliability diagram) on out-of-time and out-of-sample test sets, converts log-odds to a scaled scorecard with points-to-double-odds calibration, tests for demographic fairness using equal opportunity and disparate impact metrics across protected groups, and produces a model validation report formatted to EBA IRB model governance requirements with full variable documentation, segmentation analysis, and an override policy recommendation.
Build an IFRS 9 compliant mortgage PD scorecard. Perform WOE binning and IV selection, build logistic regression scorecard, validate Gini/KS/AUROC and calibration on out-of-time test, scale to points system, test demographic fairness for protected groups, and produce an EBA IRB-compliant model validation report with variable documentation and override policy.
Banking · Credit Risk · Regulatory
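The binning and scaling steps can be sketched as follows. This assumes the common credit-scoring convention WOE = ln(share of goods / share of bads), and the scaling anchors (600 points at 50:1 odds, 20 points to double the odds) are illustrative defaults rather than values from this manual. Function names and bin counts are hypothetical.

```python
import math


def woe_iv(bins):
    """bins: list of (n_good, n_bad) per bin. Returns (WOE per bin, total IV).

    Bins with zero goods or bads need smoothing first; that is omitted here.
    """
    total_good = sum(g for g, _ in bins)
    total_bad = sum(b for _, b in bins)
    woes, iv = [], 0.0
    for g, b in bins:
        share_good, share_bad = g / total_good, b / total_bad
        woe = math.log(share_good / share_bad)
        woes.append(woe)
        iv += (share_good - share_bad) * woe
    return woes, iv


def score_from_odds(odds, base_score=600.0, base_odds=50.0, pdo=20.0):
    """Scale good:bad odds to points so that doubling the odds adds `pdo` points."""
    factor = pdo / math.log(2)
    offset = base_score - factor * math.log(base_odds)
    return offset + factor * math.log(odds)


# Hypothetical loan-to-value bins: (goods, bads).
woes, iv = woe_iv([(400, 10), (300, 30), (300, 60)])
print([round(w, 3) for w in woes], round(iv, 3))
print(round(score_from_odds(100.0), 1))  # 620.0
```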
54
Real Estate Automated Valuation Model
A proptech company needs to build an automated valuation model for residential properties that is accurate enough to support mortgage underwriting, handles heterogeneous property types, and produces uncertainty estimates alongside point predictions.
Quyi engineers location features from coordinates (distance to CBD, nearest school rating, public transit accessibility score), extracts structural features from listing descriptions using NLP, handles missing values with multiple imputation, trains gradient boosting with quantile regression objectives to produce P10/P50/P90 price predictions, applies k-fold geographically blocked cross-validation to prevent spatial autocorrelation leakage, calculates MAPE and coverage of prediction intervals by property type and price band, uses SHAP to generate a value driver explanation for any individual property, and produces a valuation model technical report with spatial residual maps identifying geographic bias, accuracy by segment tables, and an API-ready model export format.
Build a residential AVM for mortgage underwriting. Engineer location and structural features, handle missing data with MI, train quantile gradient boosting for P10/P50/P90 predictions, validate with geographically blocked CV, calculate MAPE and interval coverage by type and price band, generate SHAP explanations, produce a technical report with spatial residual maps and an API export spec.
Real Estate · Valuation · PropTech
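The two accuracy metrics mentioned above are simple to state in code. In this minimal sketch, MAPE is computed on the P50 prediction and coverage is the share of actual prices falling inside the P10 to P90 band (nominally 80%). The function names and the sample prices are hypothetical.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, as a fraction (0.07 means 7%)."""
    return sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)


def interval_coverage(actual, lower, upper):
    """Share of actual values that fall inside [lower, upper]."""
    hits = sum(1 for a, lo, hi in zip(actual, lower, upper) if lo <= a <= hi)
    return hits / len(actual)


# Hypothetical sale prices and model quantiles, in thousands.
actual = [200.0, 400.0, 500.0, 800.0]
p10 = [180.0, 350.0, 510.0, 650.0]
p50 = [220.0, 380.0, 500.0, 700.0]
p90 = [260.0, 420.0, 560.0, 790.0]

print(round(mape(actual, p50), 5))          # 0.06875
print(interval_coverage(actual, p10, p90))  # 0.5
```

Coverage well below the nominal level, as in this toy case, would suggest the prediction intervals are too narrow for that segment.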
55
Customer Lifetime Value Segmentation
An e-commerce business wants to replace simple RFM tiers with a CLV-based segmentation that captures the full expected future revenue contribution of each customer and enables differentiated retention, acquisition, and upsell investment decisions.
Quyi fits a non-contractual BG/NBD model for purchase frequency and a gamma-gamma model for monetary value, estimates 12-month and 24-month CLV distributions for each active customer, validates CLV estimates against held-out actuals by segment, runs k-means and Gaussian mixture model clustering on CLV percentile, recency, and category breadth to produce 5–7 behaviorally distinct segments, generates a CLV pyramid visualization, calculates the optimal customer acquisition cost (CAC) ceiling for each segment by CLV, and produces a customer strategy playbook recommending specific retention, upsell, and reactivation treatments per segment with expected ROI calculations.
Build a CLV segmentation model for my e-commerce customer base. Fit BG/NBD and gamma-gamma models, estimate 12- and 24-month CLV distributions, validate on held-out actuals, cluster customers on CLV/recency/breadth with GMM, generate CLV pyramid visualization, calculate optimal CAC ceiling per segment, and produce a strategy playbook with per-segment ROI for retention, upsell, and reactivation actions.
E-commerce · CLV · Segmentation
56
Network Intrusion Detection Anomaly Model
A cybersecurity team has 30 days of network flow logs from their enterprise infrastructure and needs to build a baseline behavioral model that detects anomalous connections indicative of lateral movement or data exfiltration, with explainable alerts that security analysts can act on.
Quyi aggregates raw flow logs into host-level behavioral profiles (connection count, data volume, protocol mix, new peer connections per hour), applies Isolation Forest and HBOS (Histogram-Based Outlier Score) models trained on the first 21 days, evaluates on 9 days of labeled anomaly data, calculates precision-recall at various threshold settings to find the operational point, produces alert explanations using SHAP highlighting the specific behavioral deviations driving each score, clusters flagged hosts by anomaly type to distinguish lateral movement patterns from exfiltration patterns, and generates a SOC operations report with triage queue, detection performance metrics, and recommended SIEM integration playbook.
Build a network intrusion detection model from 30 days of enterprise flow logs. Profile host behavior, train Isolation Forest and HBOS on first 21 days, evaluate on labeled 9-day test set, tune detection threshold via precision-recall, generate SHAP alert explanations, cluster anomaly types (lateral movement vs. exfiltration), and produce a SOC triage queue and SIEM integration playbook.
Cybersecurity · Anomaly Detection · SOC
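Threshold tuning via precision-recall can be sketched in a few lines. The example assumes anomaly scores where higher means more anomalous and binary labels (1 = confirmed anomaly); the scores, labels, and function name are made up for illustration.

```python
def precision_recall(scores, labels, threshold):
    """Precision and recall when hosts scoring >= threshold are flagged."""
    flagged = [s >= threshold for s in scores]
    tp = sum(1 for f, y in zip(flagged, labels) if f and y == 1)
    fp = sum(1 for f, y in zip(flagged, labels) if f and y == 0)
    fn = sum(1 for f, y in zip(flagged, labels) if not f and y == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical anomaly scores for six hosts in the labeled test window.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]

for threshold in (0.5, 0.75):
    print(threshold, precision_recall(scores, labels, threshold))
```

Raising the threshold here removes the false positive without losing a true positive, which is the kind of trade-off the operational point is chosen on.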
57
Pharmaceutical Clinical Site Performance
A clinical operations team managing a Phase III trial at 38 sites needs to identify underperforming sites early based on enrollment rate, protocol deviation frequency, data query volume, and patient dropout, to enable proactive risk-based monitoring before timeline delays occur.
Quyi builds a multi-metric site performance index by standardizing enrollment rate vs. target, query rate per CRF page, protocol deviation rate per patient-month, and dropout rate relative to the country average, applies hierarchical clustering to group sites by performance profile, flags sites in the bottom performance quartile for urgent monitoring visits, runs a Poisson regression modeling protocol deviation counts as a function of site characteristics (country, staff experience, patient complexity) to identify structural vs. operational underperformance, calculates projected trial completion date under current enrollment trajectories with confidence intervals, and generates a clinical operations dashboard with site ranking, risk flags, projected timeline, and recommended monitoring frequency by site risk tier.
Score performance for 38 Phase III trial sites on enrollment rate, query rate, deviation rate, and dropout. Build a composite performance index, cluster sites by profile, flag bottom quartile for monitoring, model deviation counts with Poisson regression by site characteristics, project trial completion with CIs, and generate a clinical operations dashboard with risk flags and monitoring recommendations.
Pharma · Clinical Operations · Risk-Based Monitoring
58
Pharmaceutical Distribution Demand Forecasting
A pharmaceutical distributor needs monthly demand forecasts at the SKU-warehouse-channel level for 1,200 products to optimize safety stock and reduce both stockouts of critical medicines and overstock write-offs of short-shelf-life products.
Quyi segments SKUs into intermittent (Croston), seasonal (SARIMA), and regular demand profiles, incorporates patient episode data and prescription trend uploads as leading indicators, builds a reconciled hierarchical forecast ensuring SKU forecasts sum consistently to national totals, generates 3-month and 6-month ahead forecasts with safety stock recommendations based on service level targets (97.5% for critical medicines, 95% for standard), flags products with coefficient of variation above threshold for manual review, calculates the projected stockout reduction and working capital release vs. current static safety stock policy, and produces a supply chain planning report with SKU-level forecast tables, safety stock parameters, and a prioritized exception list for the planning team.
Forecast monthly demand for 1,200 pharma SKUs across 3 warehouses and 2 channels. Segment by demand profile, incorporate prescription trend data as leading indicators, build hierarchical reconciled forecasts, generate 3- and 6-month horizons with safety stock at 97.5%/95% service levels, flag high-CV SKUs, calculate working capital impact vs. current policy, and produce a supply planning report with exception list.
Pharma · Supply Chain · Forecasting
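One common way to turn a service level target into a safety stock figure is the textbook formula z × σ × √L. The sketch below uses it under stated assumptions: demand is roughly normal and independent month to month, lead time is fixed, and the target is a cycle service level. Whether Quyi uses this exact formula is not stated in the manual; the function names and the CV threshold are hypothetical.

```python
import math
from statistics import NormalDist


def safety_stock(demand_std, lead_time_months, service_level):
    """Safety stock = z * sigma * sqrt(lead time), with z from the normal quantile."""
    z = NormalDist().inv_cdf(service_level)
    return z * demand_std * math.sqrt(lead_time_months)


def needs_manual_review(demand_mean, demand_std, cv_threshold=1.0):
    """Flag SKUs whose coefficient of variation exceeds an assumed threshold."""
    return demand_std / demand_mean > cv_threshold


# Critical medicine at 97.5% vs. standard product at 95%, same demand noise.
print(round(safety_stock(100.0, 1.0, 0.975), 1))  # 196.0
print(round(safety_stock(100.0, 1.0, 0.95), 1))   # 164.5
print(needs_manual_review(demand_mean=40.0, demand_std=60.0))  # True
```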
59
Construction Cost Overrun Prediction
A construction project management firm has historical data from 340 completed projects including planned vs. actual cost and duration, project characteristics, contract type, and weather data, and wants a model to flag new projects at risk of significant overrun before mobilization.
Quyi engineers features from project scope documents, contract type, location risk, team experience, and seasonal weather patterns during the construction window, trains XGBoost and survival analysis models predicting both binary overrun (>10% cost) and continuous overrun magnitude, applies permutation importance and SHAP to identify the most predictive risk factors, segments projects by type (infrastructure, commercial, residential) and assesses model performance within each segment, generates a pre-mobilization risk score for 15 new projects in the current pipeline, and produces a project risk report ranking pipeline projects by overrun probability with the specific risk drivers highlighted for each project and recommended contract clause adjustments for the highest-risk projects.
Build a cost overrun prediction model from 340 completed construction projects. Engineer scope, contract, location, and weather features, train XGBoost and survival model for overrun probability and magnitude, apply SHAP for risk factor analysis, segment by project type, score 15 pipeline projects, and produce a risk report with per-project overrun drivers and recommended contract adjustments.
Construction · Project Risk · Contract
60
Brand Sentiment and Competitor NLP Analysis
A consumer brand wants to understand how it is perceived relative to three competitors across 84,000 social media posts and customer reviews, which product attributes drive positive vs. negative sentiment, and how perception has shifted over the past 12 months.
Quyi applies aspect-based sentiment analysis to extract sentiment scores for product attributes (quality, price, service, delivery, packaging) separately from overall sentiment, trains an LDA topic model to surface latent themes in positive and negative reviews for each brand, calculates monthly sentiment trend indices for each brand-attribute combination, runs change-point detection to identify moments of significant sentiment shift, compares brand mention volume and sentiment velocity against the three competitors, generates a competitive perception map (2D positioning from sentiment dimensions), and produces a brand intelligence report with attribute sentiment radar charts, trend lines, competitive comparison tables, and a strategic communications brief highlighting the brand's perception gaps and strengths — ready for the CMO.
Analyze 84,000 social posts and reviews across my brand and 3 competitors. Run aspect-based sentiment for 5 product attributes, fit LDA topic model for positive/negative themes, calculate monthly sentiment indices, detect change-points, generate competitive perception map, and produce a CMO-ready brand intelligence report with radar charts, trend lines, and strategic communications brief.
Marketing · NLP · Brand
61
Actuarial Loss Reserve Modeling
An insurance actuary needs to estimate outstanding loss reserves for a commercial liability portfolio using multiple reserving methods, quantify reserve uncertainty for Solvency II capital purposes, and produce a board-level reserve adequacy opinion.
Quyi runs chain-ladder, Bornhuetter-Ferguson, and Cape Cod reserving methods on the paid and incurred development triangles, tests development pattern stability with age-to-age factor weighted averages and volume-weighted selections, applies the Mack method to estimate parameter uncertainty and the bootstrapped over-dispersed Poisson to generate a full predictive distribution of the reserve, calculates the 75th, 95th, and 99.5th percentile reserve for SCR purposes, compares reserve estimates across methods and reconciles divergences with narrative, and produces a GAAP and Solvency II reserve report with development triangles, selected factors, reserve summary by accident year, uncertainty distribution, and a board-level reserve adequacy opinion letter.
Run loss reserve analysis on my commercial liability triangles. Apply chain-ladder, Bornhuetter-Ferguson, and Cape Cod methods, test factor stability, bootstrap ODP reserve distribution, calculate 75th/95th/99.5th percentile for SCR, reconcile method divergences, and produce a Solvency II reserve report with development triangles, reserve summary, uncertainty distribution, and board-level reserve opinion.
Insurance · Actuarial · Solvency II
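For readers unfamiliar with the chain-ladder method, here is a minimal sketch on a toy cumulative paid triangle, using volume-weighted age-to-age factors and no tail factor. The triangle values and function names are invented for illustration; the Mack and bootstrap uncertainty steps are not shown.

```python
def development_factors(triangle):
    """Volume-weighted age-to-age factors from a cumulative triangle.

    triangle: one list per accident year, oldest first, each truncated
    at its latest observed development period.
    """
    n_periods = len(triangle[0])
    factors = []
    for j in range(n_periods - 1):
        # Only accident years observed at both period j and j + 1 contribute.
        rows = [row for row in triangle if len(row) > j + 1]
        factors.append(sum(r[j + 1] for r in rows) / sum(r[j] for r in rows))
    return factors


def chain_ladder_reserves(triangle):
    """Reserve per accident year = projected ultimate minus latest cumulative."""
    factors = development_factors(triangle)
    reserves = []
    for row in triangle:
        ultimate = row[-1]
        for f in factors[len(row) - 1:]:
            ultimate *= f
        reserves.append(ultimate - row[-1])
    return reserves


# Toy cumulative paid triangle (three accident years).
triangle = [
    [100.0, 150.0, 165.0],
    [110.0, 165.0],
    [120.0],
]
print(development_factors(triangle))
print([round(r, 2) for r in chain_ladder_reserves(triangle)])  # [0.0, 16.5, 78.0]
```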
62
HR Workforce Attrition Analytics
A CHRO has workforce data covering 5 years of employee records, performance reviews, engagement survey responses, and exit interview codes for 12,000 employees across 8 business units, and needs to understand why attrition is concentrated in specific roles and units, and what predicts voluntary departure 6 months ahead.
Quyi calculates voluntary attrition rates by role, business unit, tenure band, and manager, runs survival analysis to model time-to-attrition by employee segment, trains a 6-month departure prediction model using gradient boosting on engagement scores, performance trajectory, promotion recency, compensation competitiveness, manager change frequency, and commute distance, validates out-of-time, generates SHAP explanations of the top attrition drivers by role, tests for pay equity issues between demographic groups, and produces an HR analytics report with attrition heatmap, survival curves by segment, a ranked retention risk list of current employees, pay equity analysis, and a business unit action plan with estimated retention ROI from targeted interventions.
Analyze 5-year workforce attrition across 12,000 employees in 8 BUs. Calculate attrition rates by role/BU/tenure, run survival analysis by segment, train 6-month departure prediction with XGBoost on engagement/performance/comp features, validate out-of-time, generate SHAP driver analysis by role, run pay equity test, and produce an HR analytics report with heatmap, survival curves, retention risk list, and BU action plan.
HR Analytics · Attrition · Workforce
63
Marketing Mix Modeling and Budget Optimization
A CMO needs to understand the revenue attribution of each marketing channel (paid search, display, TV, email, organic, influencer) over the past 2 years and reallocate a $24M annual budget to maximize incremental revenue at the corporate portfolio level.
Quyi builds a Bayesian marketing mix model with adstock (carryover decay) and saturation transformations for each channel, incorporates seasonality, macroeconomic controls, and price/promotion variables as baseline factors, estimates media ROI posteriors with credible intervals for each channel, fits diminishing returns curves to identify each channel's saturation point and marginal ROI at the current spend level, formulates a constrained budget optimization problem maximizing expected revenue subject to channel-level minimum and maximum guardrails, generates the optimal budget allocation and projects the incremental revenue lift vs. current allocation, and produces a CFO-ready marketing effectiveness report with waterfall charts, channel ROI ranking, saturation curves, and the recommended reallocation plan with uncertainty bounds.
Build a Bayesian marketing mix model for 2 years of weekly revenue and spend data across 6 channels. Apply adstock transformations, estimate channel ROI posteriors, fit saturation/diminishing returns curves, optimize $24M budget allocation with channel guardrails, project incremental revenue lift, and produce a CFO-ready report with waterfall charts, ROI ranking, saturation curves, and reallocation plan.
Marketing · MMM · Budget Optimization
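The two media transformations can be sketched independently of the Bayesian model. The example assumes geometric adstock for carryover and a Hill curve for saturation, which are common choices but not necessarily the ones Quyi selects; the parameter values and function names are illustrative.

```python
def geometric_adstock(spend, decay):
    """Carry a fraction `decay` of each period's effect into the next period."""
    carried, out = 0.0, []
    for s in spend:
        carried = s + decay * carried
        out.append(carried)
    return out


def hill_saturation(x, half_saturation, shape=1.0):
    """Diminishing-returns response in [0, 1); equals 0.5 at x == half_saturation."""
    if x <= 0:
        return 0.0
    return x ** shape / (x ** shape + half_saturation ** shape)


# A single burst of spend decays over the following weeks.
weekly_spend = [100.0, 0.0, 0.0]
adstocked = geometric_adstock(weekly_spend, decay=0.5)
print(adstocked)  # [100.0, 50.0, 25.0]
print([round(hill_saturation(a, half_saturation=50.0), 3) for a in adstocked])
```

In a full model the decay, half-saturation, and shape parameters would be estimated per channel along with the ROI coefficients.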
64
Last-Mile Delivery Route Optimization
A logistics operations team needs to optimize daily delivery routes for a fleet of 45 vehicles across an urban region with 800–1,200 daily delivery stops, accounting for vehicle capacity, time window constraints, traffic patterns, and driver shift limits — to reduce cost per delivery.
Quyi formulates the problem as a Capacitated Vehicle Routing Problem with Time Windows (CVRPTW), applies a hybrid metaheuristic (Large Neighborhood Search + local optimization) to generate near-optimal route solutions, benchmarks solution quality against the current manually planned routes, calculates the theoretical lower bound using linear programming relaxation to quantify the optimality gap, performs sensitivity analysis on key parameters (time window tightness, vehicle capacity), generates visualized route maps with color-coded vehicle assignments and delivery sequence, calculates cost per delivery, total vehicle-hours, and empty return mileage metrics, and produces a logistics operations report with route configuration files, performance comparison vs. baseline, and implementation recommendations for the dispatch system.
Optimize daily last-mile routes for 45 vehicles with 800–1,200 stops, capacity constraints, and time windows. Solve CVRPTW with Large Neighborhood Search, benchmark vs. current manual routing, calculate LP lower bound for optimality gap, run sensitivity analysis on time windows and capacity, generate route maps with vehicle assignments, report cost/delivery and empty mileage, and produce a dispatch operations report.
Logistics · Optimization · Operations
65
Data Center Power Usage Effectiveness Optimization
A cloud infrastructure operator needs to reduce the PUE (Power Usage Effectiveness) of a hyperscale data center by identifying the operating conditions that minimize cooling overhead, quantifying the contribution of each system component, and recommending HVAC setpoint adjustments.
Quyi ingests 18 months of 15-minute interval sensor data covering IT load, cooling system states, CRAC unit settings, chiller parameters, outside air temperature and humidity, and server inlet temperatures, builds a gradient boosting model predicting PUE from operational parameters with SHAP importance decomposition, identifies the optimal cooling setpoints for each outdoor temperature and IT load operating regime using a grid search validated against holdout data, estimates the annual kWh savings from each recommended setpoint adjustment, performs fault detection by flagging sensor readings that deviate from the model's expected values under current conditions, and generates a data center optimization report with PUE decomposition, operating regime maps, recommended setpoint tables, energy savings projections, and a fault detection anomaly log.
Analyze 18 months of 15-minute data center sensor data. Build a PUE prediction model with SHAP decomposition, optimize cooling setpoints by outdoor temperature and IT load regime, estimate annual kWh savings per recommendation, run fault detection against model expectations, and produce an optimization report with PUE decomposition, operating regime maps, setpoint tables, and anomaly log.
Data Center · Energy · Operations
66
Pharmacovigilance Adverse Event Signal Detection
A pharmacovigilance team needs to screen a safety database of 2.8 million spontaneous reports to detect disproportionate drug-adverse event associations that warrant expedited safety review, in compliance with ICH E2C and EMA GVP Module IX requirements.
Quyi calculates disproportionality statistics for every drug-adverse event pair using Reporting Odds Ratio (ROR), Proportional Reporting Ratio (PRR), and Bayesian Information Component (IC) with shrinkage, applies the Evans criteria and IC025 threshold to identify signals meeting standard detection thresholds, runs the Multi-Item Gamma Poisson Shrinker (MGPS) algorithm for case series analysis, stratifies by age, sex, region, and indication to detect subgroup-specific signals, performs temporal trend analysis to identify signals that have recently accelerated, and produces a pharmacovigilance signal detection report listing detected signals ranked by signal strength, time-to-onset analysis, seriousness breakdown, and a case narrative summary for the top 10 signals — formatted for regulatory submission under ICH E2C.
Run pharmacovigilance signal detection on my 2.8M spontaneous report database. Calculate ROR, PRR, and IC with shrinkage, apply Evans criteria and IC025 thresholds, run MGPS case series analysis, stratify signals by age/sex/region, analyze temporal trends, and produce an ICH E2C-compliant signal detection report with top-10 signal narratives, time-to-onset analysis, and seriousness breakdown.
Pharmacovigilance · Drug Safety · Regulatory
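The frequentist disproportionality statistics come from a 2×2 table of report counts. The sketch below computes PRR, ROR with a 95% confidence interval, and a screening flag in the spirit of the Evans criteria (at least 3 cases, PRR ≥ 2, chi-square ≥ 4). It assumes an uncorrected Pearson chi-square, and the counts and function name are hypothetical; the Bayesian IC and MGPS steps are not shown.

```python
import math


def disproportionality(a, b, c, d):
    """2x2 counts: a = drug & event, b = drug & other events,
    c = other drugs & event, d = other drugs & other events."""
    prr = (a / (a + b)) / (c / (c + d))
    ror = (a * d) / (b * c)
    # 95% CI for ROR on the log scale.
    se_log_ror = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    ror_low = math.exp(math.log(ror) - 1.96 * se_log_ror)
    ror_high = math.exp(math.log(ror) + 1.96 * se_log_ror)
    # Pearson chi-square without continuity correction (an assumption here).
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    signal = a >= 3 and prr >= 2 and chi2 >= 4
    return {"prr": prr, "ror": ror, "ror_ci": (ror_low, ror_high),
            "chi2": chi2, "signal": signal}


# Hypothetical counts for one drug-event pair.
result = disproportionality(a=20, b=80, c=100, d=9900)
print(round(result["prr"], 2), result["ror"], result["signal"])
```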
67
Retail Personalization Propensity Models
A retail marketing team needs next-best-offer models for their 1.4M loyalty card customers to power personalized email and app push campaigns across 6 product categories, replacing a manually crafted segmentation approach that hasn't been updated in 3 years.
Quyi builds six parallel propensity-to-buy models (one per category) using gradient boosting on purchase history, browsing behavior, demographic proxies, seasonal affinity, and recency-frequency-monetary features, applies calibrated probability outputs using isotonic regression, constructs a decision matrix that assigns each customer to the single highest-incremental-lift offer per channel per week, controls for selection bias using propensity score weighting on the holdout validation, estimates the incremental revenue per customer contact vs. random assignment, generates a campaign targeting list with offer assignments, and produces a personalization strategy report with model performance by category, incremental lift estimates, recommended refresh frequency, and data quality requirements — with the full model pipeline ready for integration into the CRM system.
Build 6 next-best-offer propensity models for 1.4M loyalty customers across product categories. Train gradient boosting on purchase/browse/RFM features, calibrate probabilities, construct a weekly offer decision matrix maximizing incremental lift, validate with propensity score weighting, estimate revenue per contact, generate targeting lists, and produce a CRM integration-ready personalization strategy report.
Retail · Personalization · CRM
68
Oil & Gas Production Decline Curve Analysis
A reservoir engineer has production data for 60 wells in a shale gas field and needs to forecast ultimate recovery, estimate remaining reserves, identify wells underperforming relative to type curve, and prioritize candidates for workover or restimulation.
Quyi fits Arps exponential, hyperbolic, and harmonic decline curves to each well's production history using nonlinear regression, applies the Duong rate transient model for wells with transient linear flow, estimates EUR (Estimated Ultimate Recovery) with Monte Carlo uncertainty bounds for each well, constructs a type curve from the best-performing wells using normalized production analysis, flags wells producing below the P50 type curve trajectory as underperforming candidates, calculates the NPV impact of restimulation for each underperformer using the user-provided economic parameters, and generates a reserves report with per-well EUR distributions, a field-level reserves summary, a type curve comparison waterfall chart, and a ranked workover priority list with economic justification.
Analyze production decline for 60 shale gas wells. Fit Arps and Duong models, estimate EUR with Monte Carlo uncertainty, construct a P50 type curve from top performers, flag underperforming wells, calculate restimulation NPV for each candidate using my economic parameters, and generate a reserves report with EUR distributions, type curve comparison, and ranked workover priority list.
Oil & Gas · Reservoir Engineering · Reserves
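As a rough illustration of the Arps family named in the decline-curve workflow above, the sketch below evaluates the rate and closed-form cumulative production for the exponential (b = 0), hyperbolic (0 < b < 1), and harmonic (b = 1) cases. It is not Quyi's generated code. The function names, the example well parameters, and the 360-month cutoff are assumptions made for illustration, and the nonlinear fitting and Monte Carlo steps are not shown.

```python
import math

def arps_rate(t, qi, di, b):
    """Arps rate at time t: b=0 exponential, 0<b<1 hyperbolic, b=1 harmonic."""
    if b == 0:
        return qi * math.exp(-di * t)
    return qi / (1.0 + b * di * t) ** (1.0 / b)

def arps_cumulative(t, qi, di, b):
    """Closed-form cumulative production from time 0 to t."""
    if b == 0:
        return qi / di * (1.0 - math.exp(-di * t))
    if b == 1:
        return qi / di * math.log(1.0 + di * t)
    return qi / ((1.0 - b) * di) * (1.0 - (1.0 + b * di * t) ** ((b - 1.0) / b))

# Hypothetical well: initial rate 1,000 units/month, 10% nominal monthly decline.
qi, di, b = 1000.0, 0.10, 0.5
cum_30yr = arps_cumulative(360, qi, di, b)  # cumulative volume over 360 months
```

In practice an EUR estimate would truncate the forecast at an economic rate limit rather than a fixed horizon; the fixed horizon here only keeps the example short.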
69
Semiconductor Yield Loss Root Cause
A semiconductor fab has 4 months of wafer test data from a DRAM production line including 380 process parameters, electrical test results, and bin code maps, and needs to identify which process parameters are driving a recent 3% yield drop that appeared after an equipment change.
Quyi performs spatial wafer map analysis to identify the geometric pattern of yield loss (edge, center, random, clustered), applies principal component analysis to the 380 process parameters to identify latent process signatures, runs Wilcoxon rank-sum tests comparing each process parameter distribution before vs. after the equipment change date, applies sparse logistic regression (LASSO) with the before/after binary as outcome to identify the smallest parameter set discriminating the two periods, generates a change impact timeline showing when parameter distributions shifted, analyzes die-level bin code distributions to identify which failure modes are driving the yield loss, and produces a failure analysis report with wafer map visualizations, key parameter shift plots, identified root cause parameters, and recommended process specification adjustments.
Analyze yield loss root cause in my 4-month DRAM wafer test dataset (380 process parameters). Run spatial wafer map analysis, PCA on process parameters, Wilcoxon tests comparing pre/post equipment change, LASSO to identify discriminating parameters, timeline shift detection, bin code failure mode analysis, and produce a failure analysis report with map visualizations, parameter shift plots, and process spec recommendations.
Semiconductor · Yield · Manufacturing
70
Telecom Network Performance Degradation
A network operations center has hourly KPI data from 2,800 LTE cell sites over 6 months including accessibility, retainability, throughput, and handover success rates, and needs to identify cells with degrading performance trends before customer complaints escalate, and correlate degradations with maintenance events.
Quyi calculates composite network health scores for each cell site by normalizing and weighting the 8 KPIs against operator-defined benchmarks, applies CUSUM change-point detection to each cell's time series to flag degradation onset, correlates detected change-points with maintenance event logs to classify degradations as maintenance-induced vs. spontaneous, clusters cells by degradation profile using DTW-based k-medoids clustering, trains a random forest model predicting customer complaint probability from network KPI features, generates a network operations heat map ranking cells by degradation severity and complaint risk, and produces an NOC operations report with cell-level risk tiers, root cause classification, maintenance correlation analysis, and an automated daily alerting threshold recommendation.
Analyze LTE network KPIs for 2,800 cell sites over 6 months. Score sites with composite health index, apply CUSUM change-point detection, correlate degradations with maintenance events, cluster by degradation profile via DTW k-medoids, predict complaint probability with random forest, generate a network operations heat map, and produce an NOC report with risk tiers, root cause classification, and alerting threshold recommendations.
Telecom · Network Operations · Anomaly
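The CUSUM step in the workflow above can be pictured with a minimal one-sided (downward) tabular CUSUM. This is a sketch under stated assumptions, not the platform's implementation: the function name, the sample KPI values, and the `target`, `k` (slack), and `h` (decision threshold) settings are all hypothetical.

```python
def cusum_downward(series, target, k, h):
    """Return the first index where the lower CUSUM statistic exceeds h, or None."""
    s = 0.0
    for i, x in enumerate(series):
        # Accumulate shortfalls below target, ignoring deviations smaller than k.
        s = max(0.0, s + (target - x) - k)
        if s > h:
            return i
    return None

# Hypothetical hourly accessibility KPI (%), dropping by about one point at index 5.
kpi = [99.0, 99.2, 98.9, 99.1, 99.0, 98.0, 97.9, 98.1, 97.8, 98.0]
alarm = cusum_downward(kpi, target=99.0, k=0.25, h=2.0)
```

The alarm index marks when accumulated evidence crosses the threshold, which lags the true onset by a few samples; a smaller `h` reacts faster at the cost of more false alarms.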
71
Investment Portfolio Factor Attribution
An institutional asset manager needs a factor attribution report explaining the sources of active return and risk for a $2B equity portfolio relative to its benchmark, decomposed into the Fama-French 5-factor model exposures, sector tilts, and stock selection — for the quarterly investment committee.
Quyi calculates portfolio and benchmark weights and returns for each period, estimates factor exposures using rolling 36-month Fama-French 5-factor regressions (market, size, value, profitability, investment), decomposes the active return into factor contribution, sector allocation effect, selection effect, and interaction effect using the Brinson-Hood-Beebower attribution framework, calculates tracking error, information ratio, and hit rate over rolling 12-month windows, runs a performance persistence test to evaluate skill vs. luck, generates a factor exposure drift chart over time, and produces a quarterly investment committee attribution report with performance decomposition waterfall, factor exposure heatmap, rolling IR chart, and a CIO-ready narrative comparing the manager's style drift to stated investment mandate.
Run Fama-French 5-factor attribution for my $2B equity portfolio vs. benchmark. Estimate rolling factor exposures, apply BHB decomposition into factor/sector/selection effects, calculate tracking error, IR, and hit rate over rolling 12-month windows, test performance persistence, generate exposure drift chart, and produce a quarterly IC report with performance waterfall, factor heatmap, rolling IR, and CIO narrative.
Asset Management · Factor Attribution · Portfolio
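For readers unfamiliar with the Brinson-Hood-Beebower decomposition mentioned above, a minimal single-period sketch follows. The two-sector weights and returns are made-up illustration values, and the function name is an assumption; the factor-regression part of the workflow is not shown.

```python
def brinson_attribution(sectors):
    """sectors: list of (w_port, w_bench, r_port, r_bench) tuples, one per sector."""
    allocation = sum((wp - wb) * rb for wp, wb, rp, rb in sectors)
    selection = sum(wb * (rp - rb) for wp, wb, rp, rb in sectors)
    interaction = sum((wp - wb) * (rp - rb) for wp, wb, rp, rb in sectors)
    return allocation, selection, interaction

# Hypothetical two-sector portfolio: (portfolio weight, benchmark weight,
# portfolio return, benchmark return).
sectors = [
    (0.60, 0.50, 0.08, 0.06),
    (0.40, 0.50, 0.02, 0.03),
]
allocation, selection, interaction = brinson_attribution(sectors)

# The three effects should sum to the active return.
active_return = (sum(wp * rp for wp, _, rp, _ in sectors)
                 - sum(wb * rb for _, wb, _, rb in sectors))
```

Checking that the three effects add up to the active return is a useful sanity test before any multi-period linking is attempted.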
72
Food Safety Contamination Traceability
A food manufacturer has lot traceability records covering ingredients from 85 suppliers through 6 processing steps to 4,200 retail distribution points, and needs to identify the contamination source of a product recall and model the downstream distribution scope to guide the recall boundary decision.
Quyi constructs a directed supply chain graph connecting raw material lot numbers through transformation steps to finished product lot codes and distribution records, applies graph traversal algorithms to identify all finished product lots that share provenance with the suspect ingredient lots, calculates the minimum and maximum possible contamination scope under different assumptions about cross-contamination probability, generates a traceability network visualization showing the contamination propagation tree, estimates the number of consumer units potentially affected and their geographic distribution, calculates the financial exposure of a narrow vs. broad recall boundary, and produces a food safety recall boundary report with the contamination source analysis, affected lot list, geographic impact map, and a decision matrix comparing narrow vs. broad recall costs against risk exposure — formatted for the FSMA-required record submission.
Trace contamination through my supply chain graph (85 suppliers, 6 processing steps, 4,200 distribution points). Build directed graph of lot provenance, traverse to identify all affected finished lots, estimate contamination scope under cross-contamination assumptions, generate contamination propagation visualization, calculate consumer units at risk by geography, model narrow vs. broad recall financial exposure, and produce a FSMA-compliant recall boundary report.
Food Safety · Traceability · Recall
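The graph-traversal idea in the traceability workflow above can be sketched with a breadth-first search over lot-to-lot edges. The lot identifiers and the `downstream_lots` helper are hypothetical, and real traceability data would also carry quantities and timestamps that this sketch ignores.

```python
from collections import deque

def downstream_lots(edges, suspect_lots):
    """Breadth-first search over (source_lot, derived_lot) edges.

    Returns every lot reachable from the suspect lots, excluding the suspects.
    """
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)
    seen = set(suspect_lots)
    queue = deque(suspect_lots)
    while queue:
        lot = queue.popleft()
        for nxt in graph.get(lot, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen - set(suspect_lots)

# Hypothetical provenance records: ingredient lots feed mix lots, which feed finished lots.
edges = [
    ("ING-A1", "MIX-01"), ("ING-B7", "MIX-01"), ("ING-B7", "MIX-02"),
    ("MIX-01", "FIN-100"), ("MIX-02", "FIN-200"), ("ING-C3", "MIX-03"),
    ("MIX-03", "FIN-300"),
]
affected = downstream_lots(edges, ["ING-B7"])
```

Here the suspect ingredient lot reaches two mix lots and two finished lots, while the unrelated `FIN-300` branch stays outside the recall scope.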
73
Legal Contract Risk Scoring
A general counsel needs a standardized risk scoring system that can assess a portfolio of 600 vendor contracts for liability exposure, termination risk, and data privacy obligations — identifying which contracts require immediate renegotiation before the annual budget cycle.
Quyi extracts and scores 22 contractual risk dimensions from the uploaded contract PDFs including uncapped liability clauses, auto-renewal terms, termination-for-convenience restrictions, GDPR/CCPA data processing obligations, governing law jurisdiction risk, SLA penalty structures, IP ownership ambiguities, and force majeure scope, assigns risk scores using a weighted framework calibrated to the company's risk appetite document, clusters contracts by risk profile to identify systemic vendor class risks, calculates portfolio aggregate liability exposure under pessimistic scenario assumptions, ranks contracts by composite risk score and renegotiation urgency, and produces a legal risk report with contract-level scorecards, portfolio heat map, top-20 contracts flagged for immediate action, and a model renegotiation priority plan with estimated risk reduction per action.
Score risk for my 600 vendor contract portfolio. Extract 22 risk dimensions (liability caps, auto-renewal, termination rights, GDPR obligations, SLA penalties, IP ownership) from uploaded PDFs, apply weighted scoring against my risk appetite document, cluster by risk profile, calculate aggregate liability exposure, rank by urgency, and produce a legal risk report with scorecards, portfolio heat map, top-20 immediate action list, and renegotiation priority plan.
Legal · Contract · Risk
74
Clinical Genomics Variant Prioritization
A clinical genomics lab has a whole-exome sequencing VCF file for a patient with a rare undiagnosed disease and needs to prioritize candidate pathogenic variants from 18,000 called variants down to a clinically actionable shortlist for physician review within a 72-hour turnaround.
Quyi filters variants by quality metrics, allele frequency (removing population-common variants above 1% in gnomAD), functional annotation (prioritizing exonic, splicing, and regulatory variants), applies ACMG/AMP classification criteria to assign pathogenicity scores using consequence prediction from SIFT/PolyPhen-2/CADD, integrates phenotype similarity between the patient's clinical features and known disease gene phenotypes using HPO-based scoring, applies inheritance model filtering (de novo, autosomal recessive, X-linked) consistent with the family structure, cross-references candidate genes against ClinVar and OMIM, and produces a clinical variant report ranking the top 10 candidate variants with ACMG classification, supporting evidence summary, candidate diagnosis, and a physician-ready interpretation note — formatted to ACMG secondary findings guidelines.
Prioritize pathogenic variants in my patient's WES VCF. Filter by quality and gnomAD allele frequency, annotate with SIFT/PolyPhen-2/CADD, apply ACMG/AMP criteria, integrate HPO phenotype similarity scoring, apply inheritance model filters from family structure, cross-reference ClinVar and OMIM, and produce a clinical variant report with top-10 candidates, ACMG classifications, and a physician interpretation note.
Clinical Genomics · Rare Disease · Bioinformatics
75
Executive KPI Dashboard Automated Report
A CFO needs a monthly automated analysis of the company's 42 financial and operational KPIs that goes beyond raw numbers — identifying which metrics are trending toward targets, which are at risk, what the statistical significance of recent changes is, and what the most likely drivers are based on correlations in the operational data.
Quyi ingests the current and prior-period KPI dataset, calculates month-over-month and year-over-year changes with statistical significance testing (distinguishing real trends from noise using control chart rules), applies seasonal adjustment to metrics with known cyclicality, runs correlation network analysis to identify which operational KPIs are leading indicators for lagging financial metrics, generates a traffic-light dashboard scoring each KPI against its target with trend direction, applies anomaly detection to flag metrics that are deviating from their historical distribution in ways that aren't explained by seasonal patterns, and produces a CFO-ready executive analytics report with a one-page KPI scorecard, narrative summary of the 5 most significant changes this period, leading indicator alerts, and a full appendix with underlying trends and statistical test results — ready to paste directly into the board pack.
Generate the monthly executive KPI analysis for my 42 financial and operational metrics. Calculate MoM/YoY with significance tests, seasonally adjust cyclical metrics, run correlation network to identify leading indicators for financial KPIs, build traffic-light scorecard vs. targets, flag anomalies beyond seasonal patterns, and produce a CFO-ready board pack section with one-page scorecard, narrative of top-5 changes, leading indicator alerts, and statistical appendix.
Finance · Executive · Automation
🌍 Social Sciences & Humanities Economy · Anthropology · Politics
76
Inequality Decomposition and Gini Trend
An economist studying three decades of household income surveys from a developing country needs to quantify how income inequality has evolved, decompose it by urban–rural and formal–informal sector divides, and test whether inequality-reducing policies introduced in 2010 had a measurable effect.
Quyi calculates Gini coefficients and Theil-T indices for each survey wave, applies the Shorrocks decomposition to quantify each subgroup's contribution to total inequality, runs a regression discontinuity design around the 2010 policy year to estimate the causal effect on inequality, generates Lorenz curves for each decade overlaid on a single chart, computes growth incidence curves to show which percentiles benefited most, and produces an economic inequality report with trend decomposition tables, RD estimate with bandwidth sensitivity, and a policy effectiveness section with Gini confidence intervals.
Analyze 30 years of household income survey data. Calculate Gini and Theil-T per wave, apply Shorrocks decomposition by urban/rural and formal/informal sectors, run RD design around the 2010 policy change, generate overlaid Lorenz curves and growth incidence curves, and produce an inequality report with decomposition tables and RD policy effect estimate.
Economics · Inequality · Policy
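To make the Gini and Lorenz-curve steps above concrete, here is a minimal unweighted sketch. It assumes non-negative incomes and ignores survey weights, which a real household survey analysis would need; the function names are illustrative only.

```python
def gini(incomes):
    """Gini coefficient from the sorted-rank formula (unweighted, non-negative incomes)."""
    xs = sorted(incomes)
    n = len(xs)
    total = sum(xs)
    rank_weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return 2.0 * rank_weighted / (n * total) - (n + 1.0) / n

def lorenz_points(incomes):
    """Cumulative (population share, income share) pairs, starting at (0, 0)."""
    xs = sorted(incomes)
    total = sum(xs)
    points, running = [(0.0, 0.0)], 0.0
    for i, x in enumerate(xs, start=1):
        running += x
        points.append((i / len(xs), running / total))
    return points
```

With this finite-sample formula, a population where one of n households holds all income scores (n - 1) / n rather than exactly 1.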
77
Intergenerational Social Mobility Analysis
A sociologist has linked parent–child occupational data for 8,400 family pairs across three generations and needs to measure absolute and relative intergenerational mobility, compare mobility rates across birth cohorts, and identify whether parental education or income is the stronger mobility predictor.
Quyi constructs occupational prestige scores (ISEI) for parents and children, builds transition matrices between occupational quintiles for each generation pair, calculates absolute upward mobility rates and rank-rank correlations of parent and child occupational status, decomposes the mobility gap between high- and low-education-origin families using Blinder-Oaxaca, runs logistic regression models predicting top-quintile attainment from parental income, education, and class separately, generates mobility heatmaps and rank correlation trend charts, and produces a stratification report formatted for the American Sociological Review.
Analyze intergenerational mobility for 8,400 parent–child pairs across 3 generations. Build occupational quintile transition matrices, calculate rank-rank correlations and absolute upward mobility rates, decompose mobility gaps with Blinder-Oaxaca, regress top-quintile attainment on parental income/education/class, generate mobility heatmaps and trend charts, and write an ASR-style stratification report.
Sociology · Mobility · Stratification
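The transition-matrix and upward-mobility steps above can be illustrated with a small sketch. Groups are coded 0 (lowest) upward; the three-group coding and the sample parent-child pairs are invented for illustration, whereas the workflow itself uses quintiles.

```python
def transition_matrix(pairs, n_groups):
    """Row-normalised matrix: P[i][j] = share of origin group i ending in group j."""
    counts = [[0] * n_groups for _ in range(n_groups)]
    for parent, child in pairs:
        counts[parent][child] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix

def upward_mobility_rate(pairs):
    """Share of children whose group is strictly higher than their parent's."""
    return sum(1 for parent, child in pairs if child > parent) / len(pairs)

# Hypothetical (parent_group, child_group) pairs with three groups.
pairs = [(0, 0), (0, 1), (0, 2), (1, 1), (1, 2), (1, 0),
         (2, 2), (2, 2), (2, 1), (0, 0)]
matrix = transition_matrix(pairs, 3)
up_rate = upward_mobility_rate(pairs)
```

Each row of `matrix` sums to one, so the diagonal reads directly as the share of children who stay in their parents' group.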
78
Electoral Voting Behavior Regression
A political scientist has municipality-level electoral results for five consecutive elections, combined with census data on demographics, unemployment, and migration, and needs to explain the rise of a populist party using economic voting theory while accounting for spatial autocorrelation.
Quyi tests for spatial autocorrelation in the vote share residuals using Moran's I, selects between spatial lag and spatial error models using Lagrange Multiplier tests, fits the winning spatial econometric model regressing populist vote share on unemployment change, foreign-born population, median age, and rurality, calculates marginal effects, runs a Geographically Weighted Regression to identify spatial variation in the unemployment coefficient, generates a choropleth map of predicted vote share and GWR coefficient surfaces, and produces a political science results section explaining how economic grievances and cultural threat interact geographically to predict the populist vote.
Model populist party vote share across municipalities in 5 elections. Test spatial autocorrelation with Moran's I, select spatial lag vs. spatial error model with LM tests, fit spatial regression on unemployment, migration, age, and rurality, run GWR for spatial coefficient variation, generate choropleth and coefficient surface maps, and write a political science results section on economic voting and spatial heterogeneity.
Political Science · Spatial · Electoral
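As a minimal illustration of the Moran's I statistic used in the workflow above, the sketch below computes it for a raw value vector and a binary adjacency matrix. The workflow applies the test to regression residuals, and inference (permutation or analytical variance) is not shown; the four-unit "line" of municipalities is a made-up example.

```python
def morans_i(values, weights):
    """Global Moran's I for values x_i and a spatial weights matrix w_ij."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    return (n / w_sum) * cross / sum(d * d for d in dev)

# Hypothetical: four municipalities in a row, each adjacent to its neighbours.
weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = morans_i([1, 1, 5, 5], weights)    # similar values sit together
alternating = morans_i([1, 5, 1, 5], weights)  # neighbours always differ
```

Positive values indicate that similar values cluster in space and negative values indicate a checkerboard-like pattern, which is why the clustered layout scores above zero and the alternating one below.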
79
Anthropological Kinship Network Analysis
An anthropologist has genealogical records for 620 individuals across 12 generations of a highland community and needs to quantify kinship structure, identify preferential marriage patterns, and test whether social status is transmitted through patrilineal or bilateral descent.
Quyi constructs a directed kinship graph encoding parent–child and marriage ties, calculates coefficients of relatedness for all pairs, detects preferential marriage patterns (cross-cousin, parallel-cousin) by comparing observed frequencies to a random-mating null model with permutation tests, builds logistic regression models predicting high-status marriages from patrilineal vs. matrilineal ancestry scores, measures status transmission across generations using rank correlations by descent line, generates a genealogical network visualization with status color-coding, and produces a kinship analysis report with marriage preference statistics, descent transmission coefficients, and a discussion of social structure compared to theoretical descent models.
Analyze kinship structure for 620 individuals across 12 generations. Build a directed kinship graph, calculate relatedness coefficients, test preferential marriage patterns against random-mating null with permutations, model high-status marriages by patrilineal vs. matrilineal ancestry, measure status transmission by descent line, generate a genealogical network visualization, and write a kinship anthropology report.
Anthropology · Kinship · Social Structure
80
Criminology Recidivism Risk and Disparity
A criminologist has administrative records on 6,200 released prisoners including pre-incarceration demographics, conviction type, sentence length, program participation, and 3-year re-arrest outcomes, and needs a recidivism prediction model that is both accurate and tested for racial disparity.
Quyi trains gradient boosting and logistic regression models predicting 3-year re-arrest, evaluates discrimination with AUROC and calibration by demographic group, applies the COMPAS-comparison fairness audit testing equalized odds, demographic parity, and individual fairness simultaneously, decomposes the racial prediction gap using a Kitagawa-Oaxaca decomposition separating endowment from coefficient effects, tests whether program participation significantly reduces recidivism after matching on propensity score, and produces a criminology policy report with model performance tables, fairness metric comparisons, disparity decomposition, and program effectiveness estimates with policy implications for re-entry support.
Build a 3-year recidivism prediction model for 6,200 released prisoners. Train gradient boosting and logistic regression, evaluate AUROC and calibration by race, run fairness audit (equalized odds, demographic parity, individual fairness), decompose racial prediction gap with Kitagawa-Oaxaca, test program participation effect with propensity score matching, and produce a criminology policy report with fairness metrics and program effectiveness estimates.
Criminology · Fairness · Policy
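The group-fairness checks named above (equalized odds and demographic parity) reduce to comparing a few per-group rates. The sketch below is illustrative only: the labels, predictions, and group codes are fabricated toy values, the helper name is an assumption, and it presumes every group contains both outcome classes.

```python
def group_rates(y_true, y_pred, groups):
    """Per-group true-positive, false-positive, and positive-prediction rates."""
    out = {}
    for g in set(groups):
        idx = [i for i, gi in enumerate(groups) if gi == g]
        pos = [i for i in idx if y_true[i] == 1]
        neg = [i for i in idx if y_true[i] == 0]
        out[g] = {
            "tpr": sum(y_pred[i] for i in pos) / len(pos),
            "fpr": sum(y_pred[i] for i in neg) / len(neg),
            "positive_rate": sum(y_pred[i] for i in idx) / len(idx),
        }
    return out

# Toy data: 1 = re-arrested (y_true) or flagged high risk (y_pred).
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
rates = group_rates(y_true, y_pred, groups)

# Equalized odds compares TPR and FPR across groups; demographic parity
# compares the overall positive-prediction rate.
tpr_gap = abs(rates["A"]["tpr"] - rates["B"]["tpr"])
fpr_gap = abs(rates["A"]["fpr"] - rates["B"]["fpr"])
parity_gap = abs(rates["A"]["positive_rate"] - rates["B"]["positive_rate"])
```

Individual fairness, the third criterion in the workflow, needs a similarity metric between people and is not covered by these group-level rates.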
81
Urban Gentrification Index and Displacement
An urban sociologist needs to build a gentrification index for 340 census tracts in a major city using 20 years of census data, identify which tracts are actively gentrifying, and quantify the association between gentrification intensity and displacement of low-income renters.
Quyi constructs a composite gentrification index from z-score-normalized changes in median rent, educational attainment, poverty rate, and racial composition, applies k-means clustering to identify gentrifying, stable, declining, and already-gentrified tract typologies, runs a spatial panel fixed-effects regression to estimate the association between the gentrification index and low-income household outmigration rates, controlling for housing supply and transit access, calculates displacement pressure scores for each tract, generates a faceted choropleth map showing index evolution across the 20-year study period, and produces an urban policy report with typology maps, displacement regression tables, and housing policy recommendations.
Build a gentrification index for 340 census tracts over 20 years. Normalize and composite rent, education, poverty, and racial composition changes, cluster tracts into gentrification typologies, run spatial panel FE regression of gentrification on low-income outmigration controlling for housing supply and transit, calculate displacement pressure scores, generate choropleth maps across decades, and produce an urban policy report.
Urban Sociology · Housing · Spatial
82
Education Achievement Gap Longitudinal Study
An education researcher has standardized test scores for 14,000 students tracked from grade 3 through grade 10 and needs to quantify the reading and math achievement gaps by socioeconomic status, identify when gaps widen most, and evaluate whether a school-level intervention in grades 6–7 reduced them.
Quyi fits multilevel growth curve models (students nested in schools nested in districts) estimating grade-by-grade achievement trajectories separately for low, middle, and high SES quintiles, calculates standardized gap effect sizes at each grade with confidence intervals, applies a difference-in-differences design comparing intervention and matched comparison schools in the grades 6–7 window, tests for heterogeneous effects by baseline SES and prior performance, generates gap trajectory plots with intervention period shading, and produces an education policy report with multilevel model tables, gap trend figures, DiD treatment effect estimates, and equity implications.
Analyze reading and math achievement gaps for 14,000 students grades 3–10 by SES quintile. Fit multilevel growth curve models (students/schools/districts), calculate gap effect sizes per grade with CIs, run DiD for grades 6–7 intervention vs. matched schools, test SES and prior achievement heterogeneity, generate gap trajectory plots, and produce an education equity policy report.
Education · Achievement Gap · Policy
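The difference-in-differences comparison above boils down, in its simplest two-group, two-period form, to a difference of mean changes. The sketch below uses invented school-level gap values and a hypothetical helper name; the multilevel models, matching, and standard errors from the workflow are not shown.

```python
from statistics import mean

def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """2x2 difference-in-differences: change in treated minus change in control."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change

# Hypothetical standardized achievement gaps (in SD units) per school.
effect = did_estimate(
    treated_pre=[0.60, 0.64], treated_post=[0.50, 0.54],
    control_pre=[0.58, 0.62], control_post=[0.57, 0.61],
)
```

A negative estimate here means the gap narrowed more in intervention schools than in comparison schools, which is only interpretable as an effect if the two groups would otherwise have followed parallel trends.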
83
Macroeconomic SVAR and Fiscal Multiplier
A macroeconomist needs to estimate the fiscal multiplier — the output effect of a 1% GDP government spending shock — for a panel of 24 OECD countries over 40 years, controlling for monetary policy stance, identifying structural shocks, and testing whether the multiplier varies with the business cycle.
Quyi estimates a Structural Vector Autoregression (SVAR) with Blanchard-Perotti zero and sign restrictions to identify exogenous fiscal shocks, computes impulse response functions with 68% and 90% bootstrap confidence bands for output, consumption, and investment following a spending shock, calculates the cumulative fiscal multiplier at 1, 2, 4, and 8 quarters, tests state-dependence by interacting shocks with a recession indicator using Jordà local projections, performs panel heterogeneity analysis identifying countries with significantly different multipliers, and produces a macroeconomics results section with IRF charts, multiplier tables, and state-dependence figures, formatted for a top-5 economics journal submission.
Estimate fiscal multipliers for 24 OECD countries over 40 years using SVAR with Blanchard-Perotti identification. Compute IRFs with 68%/90% bootstrap CI bands, calculate cumulative multipliers at 1/2/4/8 quarters, test recession state-dependence with Jordà local projections, analyze panel heterogeneity, and produce a top-5 economics journal results section with IRF charts and multiplier tables.
Macroeconomics · SVAR · Fiscal Policy
84
Behavioral Economics Loss Aversion Experiment
A behavioral economist has choice data from 480 participants in an incentivized lottery experiment designed to elicit individual loss aversion parameters, and needs to estimate the distribution of lambda (loss aversion coefficient) across the population, test for demographic predictors, and compare fit of prospect theory versus expected utility.
Quyi fits a cumulative prospect theory model to each participant's choice sequence using maximum likelihood, estimates individual loss aversion (λ), probability weighting (γ), and value function curvature (α) parameters, tests model identifiability with a Monte Carlo recovery simulation, compares prospect theory against expected utility and rank-dependent utility using BIC model selection, runs quantile regressions predicting λ from age, gender, income, financial literacy, and risk domain, generates lambda distribution histograms, parameter correlation plots, and individual-level model fit diagnostics, and produces a behavioral economics paper with structural estimation tables, model comparison, and demographic predictors of loss aversion.
Fit cumulative prospect theory to choice data from 480 participants. Estimate individual λ, γ, and α via MLE, run Monte Carlo recovery to test identifiability, compare CPT vs. EU vs. RDU with BIC, regress λ on demographics with quantile regression, generate lambda distribution and parameter correlation plots, and produce a behavioral economics paper with structural estimation tables and demographic predictor analysis.
Behavioral Economics · Prospect Theory · Structural Estimation
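The three parameters estimated in the workflow above (λ, γ, α) enter the model through a value function and a probability weighting function. The sketch below uses the common Tversky-Kahneman functional forms and, as a simplifying assumption, the same α and γ for gains and losses; the maximum-likelihood fitting itself is not shown and the function names are illustrative.

```python
def value(x, alpha, lam):
    """Power value function: concave for gains, steeper (by lam) for losses."""
    return x ** alpha if x >= 0 else -lam * (-x) ** alpha

def weight(p, gamma):
    """Inverse-S probability weighting function; gamma = 1 gives w(p) = p."""
    return p ** gamma / (p ** gamma + (1 - p) ** gamma) ** (1 / gamma)

def gamble_value(gain, loss, p_gain, alpha, lam, gamma):
    """Subjective value of a mixed lottery: gain with p_gain, otherwise loss."""
    return (weight(p_gain, gamma) * value(gain, alpha, lam)
            + weight(1 - p_gain, gamma) * value(loss, alpha, lam))

# With linear value and weighting, a 50/50 bet of +10 / -10 is rejected
# as soon as lam exceeds 1.
neutral = gamble_value(10, -10, 0.5, alpha=1.0, lam=1.0, gamma=1.0)
averse = gamble_value(10, -10, 0.5, alpha=1.0, lam=2.0, gamma=1.0)
```

In an estimation, these subjective values would feed a choice rule (for example a logit) whose likelihood is maximized per participant.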
85
Migration Flow Gravity Model
A demographer studying internal migration has origin-destination flow matrices for 120 regions over 15 years and needs to estimate a gravity model explaining migration intensity, decompose push and pull factors, and project population redistribution under two economic scenarios.
Quyi estimates a Poisson pseudo-maximum-likelihood gravity model with origin and destination fixed effects, distance decay, contiguity, and economic differentials (wage, unemployment, housing cost), tests for spatial dependence in residuals, computes semi-elasticities for each push/pull factor with 95% CIs, decomposes observed migration into structural vs. friction components, projects 10-year population redistribution under optimistic and pessimistic GDP growth scenarios by applying the estimated model to projected economic differentials, generates an origin-destination flow map, population change choropleth, and a demography report with gravity model tables, elasticity rankings, and scenario comparison maps.
Estimate a PPML gravity model for internal migration across 120 regions over 15 years. Include distance decay, contiguity, wage/unemployment/housing differentials with origin-destination FE, test spatial residual dependence, compute semi-elasticities, decompose structural vs. friction components, project 10-year redistribution under two GDP scenarios, generate OD flow and population change maps, and write a demography report.
Demography · Migration · Gravity Model
86
Cultural Evolution Meme Tracking in News
A cultural sociologist has 8 years of daily headline text from 24 news outlets across 6 countries and needs to track how political frames and cultural narratives emerge, spread transnationally, and decay — and whether conservative and progressive media ecosystems evolve independently.
Quyi trains dynamic topic models (DTM) to track how topic prevalence and framing evolve monthly, measures narrative contagion between outlets using cross-correlation of topic time series, builds a media ecosystem network with outlets as nodes and narrative co-adoption as edges, applies Granger causality tests to identify which outlets lead vs. follow specific narratives, detects ideological polarization by measuring cosine distance between conservative and progressive outlet topic vectors over time, identifies the top 10 narratives that crossed ideological boundaries, and produces a cultural sociology report with topic prevalence charts, transnational diffusion network, polarization trend line, and narrative bridging analysis.
Analyze 8 years of daily headlines from 24 news outlets across 6 countries with dynamic topic models. Track monthly topic evolution, measure cross-outlet narrative contagion with cross-correlation, build media ecosystem network, apply Granger causality for narrative leadership, measure conservative vs. progressive cosine distance over time, identify ideology-crossing narratives, and produce a cultural sociology report with diffusion network and polarization trends.
Cultural Sociology · Media · NLP
87
Archaeological Site Spatial Distribution
An archaeologist has GIS coordinates for 1,840 Bronze Age sites across a river basin and needs to test whether settlement patterns reflect random colonization or structured landscape use, identify site clustering near water sources, and model the relationship between site density and agricultural suitability.
Quyi runs Clark-Evans nearest-neighbor analysis and Ripley's K-function to test for spatial clustering vs. regularity against a complete spatial randomness null model, calculates kernel density estimates of site density across the landscape, fits a spatial point process model (inhomogeneous Poisson process) with distance to nearest river, slope, aspect, and soil quality covariates from uploaded GIS rasters, performs a Monte Carlo envelope test for each covariate's significance, generates a site distribution map with kernel density overlay, K-function envelope plot, and covariate partial effects, and produces an archaeological spatial analysis report with statistical test results, landscape preference profile, and settlement system interpretation.
Analyze spatial distribution of 1,840 Bronze Age sites. Run Clark-Evans and Ripley's K-function against CSR null, estimate kernel density, fit inhomogeneous Poisson process model with distance-to-river, slope, aspect, and soil quality covariates, perform Monte Carlo envelope tests, generate site map with density overlay, K-function plot, and covariate effects, and write an archaeological spatial analysis report.
Archaeology · Spatial · Landscape
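The Clark-Evans test in the workflow above compares the observed mean nearest-neighbour distance with its expectation under complete spatial randomness. The sketch below computes only the ratio R, without edge correction or a significance test; the function name, the 4 x 4 study area, and both point layouts are assumptions for illustration.

```python
import math

def clark_evans(points, area):
    """Clark-Evans ratio R: about 1 for random, below 1 clustered, above 1 regular."""
    n = len(points)
    nearest = []
    for i, (xi, yi) in enumerate(points):
        nearest.append(min(math.hypot(xi - xj, yi - yj)
                           for j, (xj, yj) in enumerate(points) if j != i))
    observed = sum(nearest) / n
    expected = 0.5 / math.sqrt(n / area)  # mean NN distance under CSR
    return observed / expected

# Hypothetical layouts inside a 4 x 4 study area (16 sites each).
grid = [(i + 0.5, j + 0.5) for i in range(4) for j in range(4)]
cluster = [(0.1 * i, 0.1 * j) for i in range(4) for j in range(4)]
r_grid = clark_evans(grid, area=16.0)
r_cluster = clark_evans(cluster, area=16.0)
```

The brute-force nearest-neighbour search is quadratic in the number of sites, which is acceptable for a few thousand points but would normally be replaced by a spatial index.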
88
Labor Market Wage Gap Decomposition
A labor economist studying the gender wage gap has linked employer–employee panel data for 42,000 workers across 1,800 firms and needs to decompose the raw wage gap into worker characteristics, firm-level sorting, and a residual unexplained component using the Abowd-Kramarz-Margolis framework.
Quyi estimates two-way fixed effects wage equations with worker and firm fixed effects using iterative least squares on the bipartite matched data, decomposes the gender wage gap into contributions from worker human capital, firm-level pay premium sorting (women disproportionately employed at low-wage firms), within-firm discrimination, and residual, tests for assortative matching between high-ability workers and high-wage firms by gender, runs quantile regressions to detect glass ceiling effects in the upper wage distribution, generates a wage distribution plot with firm-effect decomposition, quantile gap chart, and sorting visualization, and produces a labor economics paper with AKM decomposition tables, sorting test results, and policy implications for pay transparency.
Estimate an AKM two-way fixed effects wage model on my linked employer-employee panel (42,000 workers, 1,800 firms). Decompose the gender wage gap into human capital, firm sorting, within-firm, and residual components, test worker-firm assortative matching by gender, run quantile regression for glass ceiling detection, generate decomposition and quantile gap charts, and write a labor economics paper with AKM tables and pay transparency policy implications.
Labor Economics · Gender Gap · AKM
89
Public Health Tobacco Control Policy Evaluation
A public health researcher needs to evaluate whether a national tobacco tax increase in 2015 reduced smoking prevalence and cigarette consumption, using a synthetic control method that constructs a counterfactual from a weighted combination of countries that did not implement the policy.
Quyi implements the synthetic control method matching on pre-treatment smoking prevalence, GDP per capita, urbanization rate, and existing tobacco regulation stringency to construct the optimal donor-pool weights, validates the synthetic control by testing pre-treatment fit quality (MSPE), runs a series of in-time and in-space placebo tests to assess the probability of observing a comparable effect by chance, calculates the Average Treatment Effect on the Treated with permutation-based inference, tests for anticipation effects in the pre-period, generates a synthetic control trajectory chart with placebo distribution and a ratio plot of the treated/control gap, and produces a health policy evaluation report with counterfactual estimates, placebo test p-values, and recommendations for scaling the policy.
Evaluate the 2015 tobacco tax policy using synthetic control. Match on pre-treatment smoking prevalence, GDP, urbanization, and regulation stringency, validate pre-treatment fit with MSPE, run in-time and in-space placebo tests, estimate ATT with permutation inference, test for anticipation effects, generate synthetic control trajectory chart with placebo distribution, and write a health policy evaluation report.
Public Health · Synthetic Control · Policy
90
International Trade Network and Comparative Advantage
An international economist has bilateral trade flow data for 85 countries and 200 product categories over 25 years and needs to measure each country's revealed comparative advantage, identify specialization shifts, and assess whether global value chain integration has changed trade network structure.
Quyi calculates Balassa's Revealed Comparative Advantage (RCA) index for each country-product-year triple, identifies products where countries have gained or lost comparative advantage over 25 years using Mann-Kendall trend tests, builds a bipartite country-product network and computes its fitness-complexity metrics (economic complexity index), runs gravity model regressions to identify which bilateral relationships are stronger or weaker than predicted by economic fundamentals, calculates network centrality measures to identify trade hub countries, visualizes the product space as a proximity network showing specialization paths, and produces a trade economics report with RCA tables, complexity rankings, gravity model results, and GVC integration analysis.
Analyze bilateral trade flows for 85 countries, 200 products, 25 years. Calculate Balassa RCA indices, detect comparative advantage shifts with Mann-Kendall, compute economic complexity index from bipartite country-product network, run gravity regression for anomalous bilateral relationships, calculate network centrality, visualize product space proximity network, and produce a trade economics report with RCA tables, complexity rankings, and GVC integration analysis.
Trade Economics · Network · Complexity
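The Balassa index at the start of this workflow is a ratio of shares: a country's export share in a product divided by the world's export share in that product. A minimal sketch, assuming a country-by-product export matrix with no all-zero rows or columns; the function name and the toy numbers are hypothetical, not Quyi output:

```python
import numpy as np

def balassa_rca(exports):
    """Balassa RCA for a (countries x products) matrix of export values."""
    exports = np.asarray(exports, dtype=float)
    country_totals = exports.sum(axis=1, keepdims=True)  # total exports per country
    product_totals = exports.sum(axis=0, keepdims=True)  # world exports per product
    world_total = exports.sum()
    country_share = exports / country_totals             # product's share in the country's basket
    world_share = product_totals / world_total           # product's share in world trade
    return country_share / world_share

# Hypothetical toy data: 3 countries x 2 products
toy_exports = np.array([[80.0, 20.0],
                        [30.0, 70.0],
                        [50.0, 50.0]])
rca = balassa_rca(toy_exports)
print(np.round(rca, 3))
```

Values above 1 indicate that a country exports more of a product than its overall size would predict, which is the usual reading of revealed comparative advantage.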
🧪 Experiment Design Planning · Power · Protocols
91
Phase III RCT Design with Adaptive Interim
A pharmaceutical statistician needs to design a Phase III superiority trial comparing a new antihypertensive to standard of care, incorporating one adaptive interim analysis for sample size re-estimation, ensuring Type I error control at α=0.025, and producing the statistical analysis plan.
Quyi calculates the initial sample size assuming 80% power for a 5 mmHg treatment difference with SD=18 mmHg using the standard parallel-group formula, designs the interim analysis at 50% information fraction with a Cui-Hung-Wang adaptive test to allow sample size re-estimation without inflating Type I error, simulates 10,000 trials under the null and alternative hypotheses to validate operating characteristics, calculates the conditional power at interim for various assumed effect sizes, specifies stratified randomization blocks by center and baseline SBP category, generates a complete statistical analysis plan document covering primary/secondary endpoints, multiplicity adjustment, handling of missing data, sensitivity analyses, and the adaptive decision rule — formatted to ICH E9(R1) estimand framework standards.
Design a Phase III adaptive RCT for antihypertensive treatment (target 5 mmHg difference, SD=18). Calculate initial sample size at 80% power, design CHW adaptive interim at 50% information for SSR, simulate 10,000 trials for Type I error and power validation, specify stratified randomization, and produce a complete ICH E9(R1)-compliant statistical analysis plan with primary/secondary endpoints, multiplicity adjustment, and missing data strategy.
Clinical Trial · Adaptive Design · SAP
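The initial fixed-design sample size in this example follows from the standard two-arm formula n = 2(z₁₋α + z₁₋β)²σ²/δ² per group. A minimal sketch using the normal approximation, assuming a one-sided α of 0.025 and equal allocation; the function name is hypothetical and the adaptive re-estimation step is not covered:

```python
from math import ceil
from statistics import NormalDist

def n_per_group(delta, sd, alpha_one_sided=0.025, power=0.80):
    """Per-group sample size for a two-arm parallel comparison of means."""
    z_alpha = NormalDist().inv_cdf(1 - alpha_one_sided)
    z_beta = NormalDist().inv_cdf(power)
    return ceil(2 * (z_alpha + z_beta) ** 2 * sd ** 2 / delta ** 2)

# Inputs from the example: 5 mmHg difference, SD = 18 mmHg
n = n_per_group(delta=5, sd=18)
print(n)  # 204 per group under the normal approximation
```

A t-distribution correction or an allowance for dropout would raise this figure slightly.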
92
Response Surface Design for Chemical Optimization
A chemical engineer needs to optimize a reaction yield as a function of three continuous factors — temperature (60–120°C), pH (4–8), and catalyst concentration (0.1–2%) — using a minimum number of experimental runs and without assuming a linear response surface.
Quyi generates a Box-Behnken Design (BBD) for 3 factors requiring only 15 runs (vs. 27 for a full 3³ factorial), fits a second-order polynomial response surface model to the yield responses, checks model adequacy with ANOVA, lack-of-fit test, and residual diagnostics, calculates the gradient to locate the stationary point and classifies it as a maximum, minimum, or saddle point using the canonical analysis of the Hessian matrix, generates 3D surface plots and contour plots for all factor pair combinations, performs desirability function optimization to find the operating conditions maximizing yield above 90%, and produces an experiment report with the design matrix, RSM model coefficients, optimal conditions with prediction intervals, and a validation run recommendation.
Design a Box-Behnken experiment for temperature (60–120°C), pH (4–8), and catalyst conc. (0.1–2%) to optimize reaction yield. Fit second-order RSM, check adequacy with ANOVA and lack-of-fit, locate stationary point via canonical analysis, generate 3D surface and contour plots, run desirability optimization for yield >90%, and produce an RSM report with design matrix, model coefficients, and optimal conditions with prediction intervals.
Chemical Engineering · DoE · Optimization
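For three factors, the Box-Behnken design places runs at the midpoints of the cube edges plus replicated center points. A minimal sketch of the 15-run layout, assuming 3 center points (12 + 3 = 15) and no run-order randomization; the helper names are hypothetical:

```python
from itertools import combinations, product

def box_behnken_3(n_center=3):
    """Coded (-1, 0, +1) Box-Behnken runs for 3 factors."""
    runs = []
    for i, j in combinations(range(3), 2):        # each pair of factors
        for a, b in product((-1, 1), repeat=2):   # 2x2 factorial on the pair
            row = [0, 0, 0]                       # third factor held at its midpoint
            row[i], row[j] = a, b
            runs.append(tuple(row))
    runs.extend([(0, 0, 0)] * n_center)           # replicated center points
    return runs

def decode(coded, low, high):
    """Map a coded level to natural units."""
    return (low + high) / 2 + coded * (high - low) / 2

ranges = [(60, 120), (4, 8), (0.1, 2.0)]  # temperature (°C), pH, catalyst (%)
design = [tuple(decode(c, lo, hi) for c, (lo, hi) in zip(run, ranges))
          for run in box_behnken_3()]
print(len(design))  # 15
```

In practice the run order would be randomized before the experiment is carried out.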
93
Fractional Factorial Screening for Manufacturing
A manufacturing engineer has identified 7 potential process factors that may affect surface roughness in a milling operation and needs to screen them efficiently with a minimum number of runs to identify the vital few before moving to a full optimization experiment.
Quyi generates a Resolution IV 2⁷⁻³ fractional factorial design requiring only 16 runs (vs. 128 for the full factorial), constructs the confounding pattern table showing which main effects are aliased with which two-factor interactions, fits the main effects model to the roughness data, applies a Lenth pseudo-standard-error test to identify statistically significant effects without replicate runs, generates a half-normal probability plot of effect estimates to visually separate active from inactive factors, identifies the 2–3 most influential factors for follow-up experimentation, and produces a screening analysis report with the design matrix, confounding structure, effect estimates, Lenth significance test results, and a recommendation for the follow-on optimization design.
Design a 2⁷⁻³ Resolution IV fractional factorial for screening 7 milling process factors on surface roughness. Generate 16-run design, show confounding pattern, fit main effects model, apply Lenth PSE test for significance, generate half-normal probability plot, identify 2–3 influential factors, and produce a screening report with design matrix, confounding table, effect estimates, and follow-on optimization recommendation.
Manufacturing · DoE · Screening
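The 16-run design can be built from a full factorial in four base factors plus three generated columns. A minimal sketch, assuming the generators E = ABC, F = BCD, G = ACD (a common textbook choice that yields Resolution IV; Quyi may select different generators):

```python
from itertools import combinations, product

def frac_factorial_2_7_3():
    """16-run 2^(7-3) design in coded units, columns A..G."""
    design = []
    for a, b, c, d in product((-1, 1), repeat=4):  # full factorial in A, B, C, D
        e = a * b * c                              # E = ABC
        f = b * c * d                              # F = BCD
        g = a * c * d                              # G = ACD
        design.append((a, b, c, d, e, f, g))
    return design

design = frac_factorial_2_7_3()

# Resolution IV check: no main effect is aliased with another main effect
# or with a two-factor interaction, so every 2- and 3-column product sums to zero.
pairs_clear = all(sum(r[i] * r[j] for r in design) == 0
                  for i, j in combinations(range(7), 2))
triples_clear = all(sum(r[i] * r[j] * r[k] for r in design) == 0
                    for i, j, k in combinations(range(7), 3))
print(len(design), pairs_clear, triples_clear)
```

Two-factor interactions are still aliased with each other in this design, which is why the confounding table matters when interpreting the effect estimates.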
94
Crossover Trial Design for Bioequivalence
A regulatory pharmacokineticist needs to design a 2×2 crossover bioequivalence study comparing a generic to the reference formulation, determine the required sample size accounting for within-subject variability, and specify the complete analysis plan including the 90% CI acceptance criteria.
Quyi estimates the required sample size using the crossover power formula with the within-subject CV from a pilot study, designs the treatment sequence allocation (AB/BA) with washout period specification, defines the primary PK endpoints (AUC₀-∞, AUC₀-t, Cmax), specifies the linear mixed model for the log-transformed PK parameters with sequence, period, and subject effects, defines the 90% CI method for the geometric mean ratio, calculates the acceptance range (80.00–125.00%) and the probability of passing BE given assumed true ratio and CV, specifies the test for carry-over effects, generates a simulated dataset with the proposed design to validate the analysis approach, and produces a bioequivalence study protocol formatted to EMA and FDA guidance standards.
Design a 2×2 crossover bioequivalence study comparing generic vs. reference formulation. Calculate sample size from pilot within-subject CV, specify AB/BA sequences with washout, define LMM for log-transformed AUC and Cmax, specify 90% CI with 80–125% acceptance range, calculate BE passing probability, simulate validation dataset, and produce an EMA/FDA-compliant bioequivalence study protocol.
Pharmacokinetics · Bioequivalence · Regulatory
95
Bayesian Adaptive Dose-Finding (Phase I)
An oncology statistician needs to design a Phase I dose-escalation trial for a novel immunotherapy using a model-based approach that improves dose recommendation over the 3+3 rule, targets the MTD at the 25th percentile of the dose-toxicity curve, and incorporates prior information from preclinical studies.
Quyi implements the Continual Reassessment Method (CRM) with a one-parameter power model, elicits prior skeleton probabilities from the uploaded preclinical data, determines the initial dose level using the prior, simulates 1,000 trial replications under multiple true toxicity scenarios (underdosing, target, overdosing) to evaluate operating characteristics including the percentage of times each dose is recommended as MTD, the percentage of patients treated at each dose level, and the probability of overdosing, compares CRM to the standard 3+3 design on accuracy and patient safety metrics, generates operating characteristic tables, dose-toxicity curve plots with posterior uncertainty at key decision points, and produces a Phase I protocol document with CRM specification, prior skeleton, decision rules, stopping criteria, and regulatory justification for model-based design.
Design a Bayesian CRM Phase I dose-finding trial for immunotherapy. Specify power model with prior skeleton from preclinical data, target MTD at 25th DLT percentile, simulate 1,000 replications under 3 toxicity scenarios for operating characteristics (recommendation accuracy, overdosing rate, patient allocation), compare to 3+3 design, and produce a Phase I protocol with CRM specification, decision rules, and regulatory justification.
Oncology · Phase I · Bayesian Adaptive
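The core of the CRM is a one-parameter Bayesian update of the dose-toxicity curve after each cohort. A minimal sketch of that update for the power model p_i(a) = skeleton_i^exp(a), using grid integration; the skeleton values, the normal prior with variance 1.34, and the cohort data are all illustrative assumptions, not values from any real trial:

```python
import numpy as np

def crm_posterior_toxicity(skeleton, n_treated, n_dlt, prior_var=1.34):
    """Posterior mean DLT probability at each dose under the power model."""
    skeleton = np.asarray(skeleton, dtype=float)
    n_treated = np.asarray(n_treated, dtype=float)
    n_dlt = np.asarray(n_dlt, dtype=float)
    a = np.linspace(-4.0, 4.0, 4001)                   # uniform grid over the model parameter
    tox = skeleton[None, :] ** np.exp(a)[:, None]      # p_i(a) for every grid point and dose
    log_prior = -0.5 * a ** 2 / prior_var              # N(0, prior_var), up to a constant
    log_lik = (n_dlt * np.log(tox)
               + (n_treated - n_dlt) * np.log1p(-tox)).sum(axis=1)
    log_post = log_prior + log_lik
    post = np.exp(log_post - log_post.max())           # stabilise before normalising
    post /= post.sum()
    return post @ tox

skeleton = [0.05, 0.12, 0.25, 0.40, 0.55]              # hypothetical prior guesses
estimates = crm_posterior_toxicity(skeleton,
                                   n_treated=[3, 3, 0, 0, 0],
                                   n_dlt=[0, 1, 0, 0, 0])
target = 0.25
recommended = int(np.argmin(np.abs(estimates - target)))  # dose index closest to target
print(np.round(estimates, 3), recommended)
```

A full design would wrap this update in safety rules (no dose skipping, stopping criteria) and repeat it across simulated trials to obtain the operating characteristics described above.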
96
Cluster Randomized Trial for School Intervention
An education researcher plans a cluster randomized trial testing whether a teacher training program improves student math scores across 60 schools (30 intervention, 30 control), needs to calculate sample size accounting for clustering, and must design the randomization to ensure balance on key school-level confounders.
Quyi calculates the design effect from the intraclass correlation coefficient (ICC=0.12 from pilot data), derives the required number of schools and students per school for 80% power to detect a 0.25 SD treatment effect, designs covariate-constrained randomization using rerandomization to achieve balance on school size, baseline math scores, urban/rural classification, and socioeconomic index, specifies the multilevel model (students nested in schools) for the primary analysis, generates randomization diagnostics showing balance achieved vs. simple random assignment, performs a sensitivity analysis over the ICC range 0.05–0.25, and produces a complete trial protocol including CONSORT cluster extension checklist, power calculation narrative, randomization procedure description, and analysis plan with primary and secondary outcomes.
Design a 60-school cluster RCT for a teacher training program. Calculate sample size with ICC=0.12 design effect for 0.25 SD effect at 80% power, design covariate-constrained rerandomization for balance on 4 school characteristics, specify multilevel analysis model, run ICC sensitivity analysis (0.05–0.25), and produce a CONSORT cluster extension-compliant trial protocol with power narrative and randomization procedure.
Education · Cluster RCT · Multilevel
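The clustering adjustment rests on the design effect DEFF = 1 + (m − 1)·ICC, where m is the number of students per school. A minimal sketch of the power calculation and the ICC sensitivity sweep, assuming a two-sided α of 0.05, equal cluster sizes, no covariate adjustment, and a hypothetical 25 students per school:

```python
from math import ceil
from statistics import NormalDist

def design_effect(cluster_size, icc):
    """Variance inflation from sampling intact clusters of equal size."""
    return 1 + (cluster_size - 1) * icc

def clusters_per_arm(effect_sd, cluster_size, icc, alpha=0.05, power=0.80):
    """Schools needed per arm for a standardized effect (normal approximation)."""
    z = NormalDist()
    n_individual = 2 * (z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) ** 2 / effect_sd ** 2
    n_clustered = n_individual * design_effect(cluster_size, icc)
    return ceil(n_clustered / cluster_size)

# Sensitivity over the ICC range named in the example
for icc in (0.05, 0.12, 0.25):
    print(icc, round(design_effect(25, icc), 2), clusters_per_arm(0.25, 25, icc))
```

Adjusting for baseline math scores in the multilevel model typically reduces the residual ICC and variance, so a covariate-adjusted calculation would require fewer schools than this unadjusted sketch suggests.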
97
Agricultural Latin Square and Mixed Model ANOVA
An agronomist needs to test 5 nitrogen fertilizer treatments across a field with known gradients in two directions, minimize confounding from soil fertility variation, and estimate treatment effects with maximum efficiency using a design that accounts for both row and column blocking.
Quyi generates a 5×5 Latin square design layout randomizing treatments across rows and columns, calculates the relative efficiency of the Latin square vs. a completely randomized design and a randomized complete block design using the efficiency formula, fits a linear mixed model with treatment as fixed effect and row and column as random blocking effects using REML estimation, conducts an F-test for treatment differences, runs Tukey-Kramer pairwise comparisons with multiplicity correction, calculates the efficiency gain from blocking (estimated information per unit relative to CRD), generates a field layout visualization, treatment mean comparison plot with connecting letters report, and produces an agricultural research report with Latin square design rationale, ANOVA table, pairwise comparisons, and a fertilizer recommendation based on the results.
Design a 5×5 Latin square for 5 nitrogen fertilizer treatments in a field with bidirectional gradients. Calculate design efficiency vs. CRD and RCBD, fit linear mixed model with REML for row and column blocking, run Tukey-Kramer comparisons, calculate blocking efficiency gain, generate field layout visualization and CLR plot, and produce an agricultural report with ANOVA table, pairwise comparisons, and fertilizer recommendation.
Agriculture · Latin Square · DoE
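A randomized Latin square can be produced by permuting the rows, columns, and treatment labels of a cyclic base square. A minimal sketch, assuming placeholder treatment labels T1 to T5 (hypothetical stand-ins for the five nitrogen rates); this samples a valid square but not uniformly from all possible 5×5 Latin squares:

```python
import random

def latin_square(treatments, seed=None):
    """Return a k x k layout with each treatment once per row and column."""
    rng = random.Random(seed)
    k = len(treatments)
    base = [[(r + c) % k for c in range(k)] for r in range(k)]  # cyclic base square
    row_order = rng.sample(range(k), k)      # shuffle rows
    col_order = rng.sample(range(k), k)      # shuffle columns
    labels = rng.sample(list(treatments), k) # shuffle treatment-to-symbol assignment
    return [[labels[base[r][c]] for c in col_order] for r in row_order]

treatments = ["T1", "T2", "T3", "T4", "T5"]
layout = latin_square(treatments, seed=42)
for row in layout:
    print(" ".join(row))
```

Fixing the seed makes the field layout reproducible, which is useful when the randomization has to be documented in the report.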
98
N-of-1 Trial Design for Personalized Medicine
A clinician wants to determine whether a specific patient with chronic migraine benefits from preventive treatment A versus treatment B using a rigorous within-person repeated crossover design, while detecting a minimum clinically important difference of 3 migraine days per month.
Quyi designs a 3-pair N-of-1 trial (6 periods of AB or BA) with 4-week treatment periods and 1-week washout, calculates the required number of pairs for 80% power to detect a 3 migraine-day difference with within-person SD estimated from historical data, generates the randomized treatment sequence, specifies the primary analysis as a paired t-test on period means with carry-over assessment, designs the patient-reported outcome collection schedule, performs a Bayesian analysis of the trial results using a hierarchical model that pools information from multiple N-of-1 patients when available, generates migraine frequency time series plots by period, a treatment comparison forest plot, and produces a precision medicine report explaining the evidence for or against individualized treatment effect and a clinical recommendation.
Design a 3-pair N-of-1 trial for migraine prevention (Treatment A vs. B). Calculate pairs needed for 80% power (MCID=3 days/month), generate randomized sequence with washout specification, specify paired t-test with carry-over assessment, run Bayesian hierarchical analysis pooling N-of-1 patients, generate period time series and treatment comparison forest plot, and produce a precision medicine report with individualized treatment effect evidence and clinical recommendation.
Precision Medicine · N-of-1 · Bayesian
99
Regression Discontinuity Threshold Design
A policy researcher wants to evaluate the causal effect of a means-tested scholarship program by exploiting the sharp income cutoff that determines eligibility, and needs to design the analysis to ensure valid causal inference and assess robustness to bandwidth choice and functional form assumptions.
Quyi verifies the RD validity by testing for density manipulation at the income threshold using the McCrary density test, checks for covariate discontinuities (baseline characteristics should be continuous at the threshold), selects the optimal bandwidth using the Calonico-Cattaneo-Titiunik (CCT) procedure and local polynomial regression, estimates the Local Average Treatment Effect at the threshold with robust bias-corrected confidence intervals, performs the donut-hole robustness check excluding observations very close to the threshold (potentially manipulated), plots the RD graphical visualization with binscatter and polynomial fit, tests functional form sensitivity using linear, quadratic, and cubic specifications, and produces a policy evaluation report with the RD estimate, bandwidth sensitivity table, placebo tests at non-threshold cutoffs, and policy interpretation of the local treatment effect.
Evaluate a scholarship program using RD design around income threshold. Run McCrary density test, check covariate discontinuities, select optimal bandwidth with CCT procedure, estimate LATE with bias-corrected CIs, perform donut-hole robustness check, generate RD visualization with binscatter, test functional form sensitivity (linear/quadratic/cubic), run placebo cutoffs, and produce a policy evaluation report with RD estimate table and interpretation.
Policy Evaluation · RD Design · Causal Inference
100
Conjoint Analysis for Product Attribute Valuation
A market researcher needs to understand which product attributes (price, battery life, screen size, brand) drive smartphone purchase decisions and quantify consumers' willingness to pay for each attribute improvement using a discrete choice experiment.
Quyi generates an orthogonal or D-optimal choice design with 12 choice tasks across 4 attributes at 3–4 levels each, fits a mixed logit (random parameters logit) model allowing for preference heterogeneity across respondents, estimates part-worth utilities and their distributions across the population, calculates willingness to pay for each attribute level as the ratio of the attribute coefficient to the price coefficient with delta-method standard errors, runs market simulation to predict market share for new product configurations, applies latent class analysis to identify 3–4 consumer segments with distinct preference profiles, generates attribute importance rankings, WTP confidence interval plots, and market simulation scenarios, and produces a consumer insights report with model coefficients, WTP table, segment profiles, and product configuration recommendations for each target segment.
Analyze my conjoint experiment (12 tasks, 4 attributes: price/battery/screen/brand). Fit mixed logit with preference heterogeneity, estimate part-worth utilities and WTP per attribute level with delta-method SEs, run market simulation for new configurations, apply latent class analysis for 3–4 segments, generate WTP CI plots and attribute importance rankings, and produce a consumer insights report with WTP table, segment profiles, and product configuration recommendations.
Market Research · Conjoint · WTP
📊 Sampling & Population Studies Survey Design · Estimation · Inference
Presidential Election Poll — Complete Sampling Design
Full end-to-end example: stratification · allocation · weighting · MoE · reporting
A polling organization needs to produce a nationally representative presidential election poll for a country of 38 million eligible voters across 8 geographic regions, 5 age groups, 3 educational strata, and 2 gender categories. The poll must achieve a margin of error of ±2.5 percentage points at 95% confidence, account for differential response rates across demographic groups, and correct for known biases from prior elections, all within a budget of 1,800 telephone interviews.
What Quyi produces:

Phase 1 — Sample Size & Stratification Design. Quyi calculates the required sample size for ±2.5% MoE at 95% confidence assuming maximum variance (p=0.5): n = z²·p·(1-p)/e² = 1,537 before design effect adjustment. Applying a design effect of DEFF=1.15 (estimated from a previous survey's intraclass correlation) inflates the required sample to 1,768 interviews, which is rounded up to the 1,800-interview budget. The population is stratified by 8 geographic regions × 3 education strata, yielding 24 strata. The sample is allocated proportionally to population size with a minimum floor of 30 per stratum to support regional estimates. Output: a 24-row stratification table with population N, sample n, and sampling fraction for each stratum.
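
The Phase 1 arithmetic can be reproduced in a few lines. A minimal sketch using the formula quoted above, with the design effect applied as a simple multiplier; the function name is hypothetical:

```python
from math import ceil
from statistics import NormalDist

def srs_sample_size(moe, confidence=0.95, p=0.5):
    """Simple-random-sample size for a proportion: z^2 * p * (1 - p) / e^2."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z ** 2 * p * (1 - p) / moe ** 2

n_srs = srs_sample_size(moe=0.025)   # before design effect adjustment
n_design = n_srs * 1.15              # inflated by DEFF = 1.15
print(ceil(n_srs), ceil(n_design))   # 1537 1768
```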

Phase 2 — Differential Response Rate Adjustment. Historical response rates by demographic group (18–24: 18%, 65+: 52%, no-college: 22%, college: 45%) are applied to calculate the oversampling factors needed to achieve the target n per cell after expected non-response. Quyi calculates gross contact targets per stratum accounting for each group's response rate, generates a field contact allocation table, and flags 3 strata where budget constraints require deliberate undersampling with post-hoc weighting compensation.

Phase 3 — Post-Stratification Weighting. Once data is collected, Quyi builds a raking (iterative proportional fitting) weight matrix aligning the sample margins simultaneously to 6 census control variables: region, age group, gender, education, urbanicity, and past-vote recall. Weight trimming at the 95th percentile prevents any single respondent from carrying disproportionate influence. The effective sample size after weighting is calculated as n_eff = (Σwᵢ)²/Σwᵢ², and the design effect from weighting is reported.
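
Raking and the effective sample size formula above can be illustrated on a toy sample. A minimal sketch with two control variables instead of six and without weight trimming; the respondent codes and target shares are hypothetical, and every category is assumed to appear at least once in the sample:

```python
import numpy as np

def rake(categories, targets, n_iter=100, tol=1e-10):
    """Iterative proportional fitting of respondent weights to target margins."""
    n = len(next(iter(categories.values())))
    w = np.ones(n)
    for _ in range(n_iter):
        max_shift = 0.0
        for var, codes in categories.items():
            target = np.asarray(targets[var], dtype=float)
            current = np.bincount(codes, weights=w, minlength=len(target)) / w.sum()
            factors = target / current            # adjustment per category
            w = w * factors[codes]                # apply to each respondent
            max_shift = max(max_shift, np.abs(factors - 1).max())
        if max_shift < tol:                       # all margins already match
            break
    return w * n / w.sum()                        # normalise to mean weight 1

def kish_n_eff(w):
    """Effective sample size n_eff = (sum w)^2 / sum(w^2)."""
    return w.sum() ** 2 / (w ** 2).sum()

gender = np.array([0, 0, 0, 0, 0, 1, 1, 1])       # hypothetical coded respondents
region = np.array([0, 0, 1, 1, 1, 0, 1, 1])
weights = rake({"gender": gender, "region": region},
               {"gender": [0.5, 0.5], "region": [0.4, 0.6]})
n_eff = kish_n_eff(weights)
print(np.round(weights, 3), round(n_eff, 2))
```

The ratio of the nominal sample size to n_eff gives the design effect from weighting that the phase reports.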

Phase 4 — Estimates and Confidence Intervals. Weighted voting intention estimates are calculated for each candidate with standard errors using the Taylor linearization method for complex survey data (not simple random sample SEs). The 95% confidence intervals and margin of error for the topline estimates and each demographic subgroup are reported. Subgroup margins of error (e.g., 18–24 year olds, n_eff=85) are clearly flagged as wider than the topline MoE.

Phase 5 — Uncertainty and Herding Check. Bootstrap resampling (1,000 replicates) validates the analytical standard errors. Quyi compares the sample distribution of demographic covariates to census benchmarks and calculates a χ² balance test before and after weighting. A house effects analysis comparing the weighted results to 5 prior polls is generated to detect systematic bias (herding detection).

Output deliverables: A polling methodology report (AAPOR transparency checklist), topline results table with weighted estimates and MoE per candidate, demographic crosstab tables with subgroup MoE warnings, weight distribution histogram, balance diagnostic table, bootstrap uncertainty validation, and a press release-ready summary table.
Design and analyze a presidential election poll for 38M eligible voters in 8 regions. Calculate required n for ±2.5% MoE at 95% CI with DEFF=1.15, design 24-stratum proportional allocation with minimum floor of 30, calculate oversampling factors from historical response rates by demographic, build a raking weight matrix to 6 census controls with weight trimming at 95th percentile, calculate complex-survey Taylor linearization SEs and MoE per subgroup, validate with 1,000-replicate bootstrap, run balance diagnostic and house effects analysis, and produce an AAPOR-compliant methodology report with topline results table, demographic crosstabs, and press release summary.
Polling · Complex Survey · Weighting · Electoral · Sampling Design
102
Genomic Population Sampling Strategy
A population geneticist planning a study of genetic diversity and admixture across 12 indigenous communities in South America needs to determine how many individuals to sample per community, ensure the sample captures the full allele frequency spectrum including rare variants, and design recruitment that minimizes relatedness-induced bias.
Quyi calculates sample size requirements under two objectives: (1) detecting alleles at frequency ≥5% with 95% probability requires n≥59 per community (from the binomial probability of sampling at least one copy, 1 − 0.95ⁿ ≥ 0.95), (2) reliably estimating FST between community pairs with power 0.80 requires n≥30 per population using the Weir-Cockerham estimator variance formula. Quyi recommends n=60 per community for 720 total individuals, designs a relatedness screening protocol excluding second-degree or closer relatives using KING kinship coefficients, specifies a stratified spatial sampling strategy within each community territory to capture geographic substructure, calculates expected coverage of common and low-frequency variants under the proposed design using population genetic theory, and produces a genomic sampling protocol with community sample sizes, kinship exclusion criteria, geographic stratification map, power calculations for GWAS and admixture analyses, and an IRB-ready ethics section on community consent and data sovereignty.
Design a genomic sampling strategy for 12 South American indigenous communities. Calculate n to detect alleles at ≥5% frequency with 95% probability and to estimate FST with 80% power, recommend final sample size, design kinship screening with KING coefficients to exclude ≥2nd-degree relatives, specify stratified spatial sampling within territories, calculate variant coverage under the design, and produce a genomic sampling protocol with power calculations for GWAS and admixture analyses and an IRB ethics section.
Population Genetics · Sampling Design · GWAS
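The detection threshold in objective (1) can be checked directly from the probability of seeing at least one copy of the allele. A minimal sketch, assuming each sampled unit is one independent draw that carries the allele with probability equal to its frequency (counting two chromosomes per diploid individual would roughly halve the number of individuals); the function names are hypothetical:

```python
from math import ceil, log

def detection_probability(freq, n_draws):
    """Probability of observing at least one copy in n independent draws."""
    return 1 - (1 - freq) ** n_draws

def min_draws_to_detect(freq, prob):
    """Smallest n with detection probability of at least prob."""
    return ceil(log(1 - prob) / log(1 - freq))

n_min = min_draws_to_detect(freq=0.05, prob=0.95)
print(n_min, round(detection_probability(0.05, n_min), 4))
```

The recommended n=60 per community clears this bound under either counting convention.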
103
Wildlife Population Capture-Recapture Estimation
An ecologist needs to estimate the abundance of a secretive bird species across a 2,400 km² reserve using camera trap data from two sampling occasions, calculate density, and determine whether existing trapping effort is sufficient to achieve a CV below 20% on the abundance estimate.
Quyi fits the Lincoln-Petersen mark-recapture estimator and the Chapman modification for small samples to the capture histories, tests the closure assumption with a Stanley-Burnham test, applies a Huggins closed-population model incorporating individual heterogeneity covariates (trap location, habitat type) to estimate detection probability p and abundance N with 95% CIs, calculates population density using an effective sampled area derived from average movement distance, runs a power analysis showing how CV(N̂) changes with additional trapping effort (n occasions and n traps), and produces an ecology report with abundance and density estimates, detection probability table, closure test results, CV-vs-effort power curve, and a field protocol recommendation for the optimal camera placement and number of sampling occasions to achieve the CV target.
Estimate bird abundance from 2-occasion camera trap data across 2,400 km². Apply Lincoln-Petersen and Chapman estimators, test closure assumption with Stanley-Burnham test, fit Huggins closed-population model with habitat heterogeneity covariates, calculate density from effective sampled area, run CV vs. effort power analysis, and produce an ecology report with abundance/density estimates, detection probability, closure test, and optimal trapping protocol recommendation.
Ecology · Mark-Recapture · Wildlife
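The two-occasion estimators named in this example have closed forms. A minimal sketch, assuming a closed population and using hypothetical counts (n1 individuals identified on occasion 1, n2 on occasion 2, m2 seen on both); the variance expression is the standard one for the Chapman estimator:

```python
from math import sqrt

def lincoln_petersen(n1, n2, m2):
    """Classic estimator N = n1 * n2 / m2 (undefined when m2 = 0)."""
    return n1 * n2 / m2

def chapman(n1, n2, m2):
    """Chapman's bias-corrected modification for small samples."""
    return (n1 + 1) * (n2 + 1) / (m2 + 1) - 1

def chapman_cv(n1, n2, m2):
    """Coefficient of variation of the Chapman estimate."""
    var = ((n1 + 1) * (n2 + 1) * (n1 - m2) * (n2 - m2)
           / ((m2 + 1) ** 2 * (m2 + 2)))
    return sqrt(var) / chapman(n1, n2, m2)

n1, n2, m2 = 60, 50, 20                      # hypothetical capture counts
print(lincoln_petersen(n1, n2, m2))          # 150.0
print(round(chapman(n1, n2, m2), 1))         # 147.1
print(round(chapman_cv(n1, n2, m2), 3))
```

Comparing the CV against the 20% target for different assumed capture counts gives a rough version of the CV-versus-effort curve described above.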
104
National Health Survey Complex Sample Analysis
A public health analyst has data from a nationally representative multi-stage probability survey with unequal selection probabilities, stratification, and clustering, and needs to correctly estimate prevalence of chronic conditions and their confidence intervals, accounting for the complex survey design.
Quyi loads the survey data with its design variables (strata IDs, cluster IDs, sampling weights), implements Taylor series linearization for variance estimation using the survey design structure, calculates weighted prevalence estimates for 12 chronic conditions with design-based 95% CIs, disaggregates by age, sex, region, and SES quintile using subpopulation analysis that respects the full design, runs design-adjusted logistic regression for risk factors of hypertension while accounting for clustering, compares the design-based estimates to naive (unweighted) estimates to demonstrate the magnitude of design effect bias, calculates the effective sample size per subgroup, and produces a public health surveillance report with prevalence tables, design effect diagnostics, demographic disaggregation, logistic regression odds ratios with correct SEs, and a methods note explaining the complex survey estimation approach for lay audiences.
Analyze a nationally representative multi-stage health survey with strata, clusters, and unequal weights. Estimate prevalence of 12 chronic conditions with Taylor linearization 95% CIs, disaggregate by age/sex/region/SES with subpopulation analysis, run design-adjusted logistic regression for hypertension risk factors, compare design-based vs. naive estimates for DEFF illustration, and produce a surveillance report with prevalence tables, demographic breakdowns, logistic regression ORs, and complex survey methods note.
Public Health · Complex Survey · Surveillance
105
Time-Location Sampling for Hard-to-Reach Populations
A social epidemiologist needs to estimate HIV prevalence and risk behavior in a population of sex workers in an urban area who cannot be reached by standard household surveys, using a venue-based time-location sampling (TLS) methodology that produces valid population-level estimates.
Quyi generates a day-time unit (DTU) sampling frame from venue enumeration data, calculates the number of DTUs to sample for 80% power to detect a 10-percentage-point difference in HIV prevalence, designs probability-proportional-to-size (PPS) selection of venues weighted by estimated attendance, calculates venue-level sampling weights (1/probability of selection), applies post-stratification calibration for venue type, calculates HIV prevalence with complex survey standard errors using linearization, runs a sensitivity analysis for the assumption that all venue attendees had an equal chance of being approached, generates a recruitment flow chart, venue attendance distribution, and a prevalence estimate with MoE, and produces an epidemiology report with TLS methodology justification, HIV and behavioral risk prevalence tables, subgroup analysis, and key population surveillance recommendations.
Design a time-location sampling study for sex workers in an urban area. Generate DTU sampling frame, calculate DTUs needed for 80% power, design PPS venue selection, calculate sampling weights, estimate HIV prevalence with complex survey SEs, run sensitivity analysis on equal-approach assumption, generate recruitment flowchart and attendance distribution, and produce an epidemiology report with TLS methodology, prevalence tables, and key population surveillance recommendations.
Epidemiology · Key Populations · Sampling
106
Longitudinal Cohort Attrition and Bias Analysis
An epidemiologist running a 10-year prospective cohort study has lost 34% of the original 5,200 participants to follow-up and needs to assess whether attrition is random or selective, estimate the bias introduced, and apply weighting methods to restore representativeness for the complete-case analysis.
Quyi compares baseline characteristics of completers vs. dropouts with standardized mean differences and chi-square tests, builds a logistic regression model of dropout probability on baseline variables to identify predictors of attrition, calculates inverse probability of censoring weights (IPCW) to up-weight remaining participants to represent the full original cohort, applies multiple imputation by chained equations (MICE) as a second missing data strategy, compares primary analysis estimates under complete case, IPCW, and MICE approaches to assess the sensitivity of findings to attrition, runs a Plasmode simulation using the observed data to empirically validate that the IPCW approach recovers unbiased estimates under known missingness mechanisms, and produces a study limitations report with attrition characterization, SMD balance plots before and after weighting, analysis comparison table, and STROBE-compliant missing data methods section.
Analyze 34% attrition in a 10-year cohort (n=5,200). Compare completers vs. dropouts with SMDs, model dropout probability with logistic regression, calculate IPCW weights, apply MICE as alternative, compare primary outcomes under complete case/IPCW/MICE, validate IPCW with Plasmode simulation, generate SMD balance plots before/after weighting, and produce a STROBE-compliant attrition report with missing data methods section.
Cohort Study · Missing Data · IPCW
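As a rough illustration of two quantities named in this example, the sketch below computes a standardized mean difference for one continuous baseline variable and converts predicted dropout probabilities into inverse probability of censoring weights. The function names and the sample values are hypothetical, and the dropout probabilities are assumed to come from an already fitted logistic model; this is not Quyi's generated code.

```python
from math import sqrt
from statistics import mean, variance


def standardized_mean_difference(group_a, group_b):
    """SMD for a continuous variable: mean difference over the averaged-variance SD."""
    pooled_sd = sqrt((variance(group_a) + variance(group_b)) / 2)
    return (mean(group_a) - mean(group_b)) / pooled_sd


def ipc_weights(dropout_probs):
    """Inverse probability of censoring weights: 1 / P(remaining in the cohort)."""
    return [1.0 / (1.0 - p) for p in dropout_probs]


# Hypothetical baseline ages and model-predicted dropout probabilities
completers_age = [52, 48, 61, 55, 49, 58]
dropouts_age = [44, 39, 50, 47, 41, 46]
smd_age = standardized_mean_difference(completers_age, dropouts_age)
weights = ipc_weights([0.20, 0.50, 0.34])
```

An absolute SMD above roughly 0.1 is a common rule of thumb for meaningful imbalance between completers and dropouts.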
107
Adaptive Cluster Sampling for Rare Phenomenon
A botanist needs to estimate the total abundance of a rare orchid species that grows in dense but spatially scattered patches across a 500 km² study area, where conventional systematic sampling would miss most patches but surveying the entire area is infeasible.
Quyi designs an adaptive cluster sampling scheme with an initial systematic grid of 80 primary units, where any unit containing ≥1 orchid triggers sampling of all 8 Moore-neighborhood adjacent units, calculates the unbiased Horvitz-Thompson estimator for total abundance using the network-based inclusion probabilities that arise from the adaptive rule, estimates the variance of the HT estimator using the without-replacement variance formula, calculates the expected sample size and coefficient of variation under the adaptive design versus conventional systematic sampling using simulation, generates a sampling frame map showing initial grid, triggered adaptive additions, and final sampled networks, and produces a botanical survey report with HT abundance estimate, 95% CI, design comparison table, and field protocol specification with decision rules for adaptive expansion.
Design an adaptive cluster sampling study for rare orchid abundance across 500 km². Specify initial 80-unit systematic grid, adaptive trigger rule (≥1 orchid → sample 8 neighbors), calculate Horvitz-Thompson estimator with network inclusion probabilities, estimate HT variance, simulate expected sample size and CV vs. conventional systematic sampling, generate sampling frame map, and produce a botanical survey report with abundance estimate, 95% CI, and field protocol.
Ecology · Adaptive Sampling · Rare Species
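The Horvitz-Thompson step can be sketched as follows. This assumes the initial units are drawn by simple random sampling without replacement from N grid cells (the textbook setting for adaptive cluster sampling; a systematic initial grid would need a different inclusion probability), and the function names and example counts are hypothetical.

```python
from math import comb


def network_inclusion_prob(total_units, initial_n, network_size):
    """P(at least one unit of the network falls in the initial sample)."""
    return 1 - comb(total_units - network_size, initial_n) / comb(total_units, initial_n)


def ht_total(networks, total_units, initial_n):
    """Modified Horvitz-Thompson total over the distinct networks intersected.

    networks: list of (orchid_count_in_network, network_size_in_units).
    """
    return sum(
        count / network_inclusion_prob(total_units, initial_n, size)
        for count, size in networks
    )


# Hypothetical survey: 2,000 grid cells, 80 initial units, three distinct networks found
estimate = ht_total([(12, 3), (0, 1), (7, 2)], total_units=2000, initial_n=80)
```

Larger networks are more likely to be hit by the initial sample, so their counts are divided by a larger inclusion probability and contribute less per orchid to the total.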
108
Household Income Survey with Optimal Allocation
A national statistics office plans a household income survey across 5 wealth strata in a country of 4.2 million households, needs to allocate a total sample of 6,000 households across strata to minimize the variance of the national mean income estimate, and must report regional and stratum-level estimates with acceptable precision.
Quyi implements Neyman optimal allocation, which allocates sample in proportion to each stratum's population size multiplied by its within-stratum standard deviation (estimated from a pilot survey), compares optimal allocation to proportional and equal allocation on expected CV of the national mean estimate, calculates the expected CV and MoE for each stratum under the optimal allocation, identifies strata where precision requirements cannot be met within the total sample budget and proposes sample reallocation strategies, designs the two-stage sampling within strata (PSU selection with PPS, then household selection within PSUs), specifies the calibration estimator incorporating auxiliary information from the census on household size and urban/rural classification, simulates 500 synthetic surveys under the proposed design to validate the analytical variance estimates, and produces a survey methodology report with allocation table, precision comparison, two-stage design specification, calibration variables, and a quality assurance protocol for field data collection.
Design a household income survey for 4.2M households across 5 wealth strata. Implement Neyman optimal allocation using pilot variance estimates, compare to proportional and equal allocation on CV, calculate stratum-level precision, design two-stage PPS sampling, specify calibration estimator with census auxiliary variables, simulate 500 synthetic surveys for variance validation, and produce a survey methodology report with allocation table, precision comparison, and field QA protocol.
Survey Methodology · Optimal Allocation · National Statistics
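A minimal sketch of the Neyman allocation rule, n_h = n × N_h S_h / Σ N_h S_h, is shown below. The stratum sizes are made up to sum to 4.2 million households and the standard deviations are placeholder pilot values; rounding and minimum-sample rules per stratum are left out.

```python
def neyman_allocation(total_n, stratum_sizes, stratum_sds):
    """Allocate total_n across strata in proportion to N_h * S_h."""
    products = [size * sd for size, sd in zip(stratum_sizes, stratum_sds)]
    total = sum(products)
    return [total_n * p / total for p in products]


# Hypothetical strata (households) and pilot income standard deviations
sizes = [1_500_000, 1_200_000, 800_000, 500_000, 200_000]
sds = [200, 350, 600, 1200, 4000]
allocation = neyman_allocation(6000, sizes, sds)
```

With these placeholder inputs the smallest but most variable stratum receives the largest share of the 6,000 households, which is the behavior that distinguishes Neyman from proportional allocation.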
109
Respondent-Driven Sampling for Hidden Network
A sociologist studying informal economic networks in a city cannot identify the target population from any frame and needs to use respondent-driven sampling starting from 6 seeds, while estimating population proportions with RDS-II estimators that correct for differential recruitment probability.
Quyi implements the RDS-II Volz-Heckathorn estimator, weighting each respondent by the inverse of their self-reported network degree on the assumption that inclusion probability is proportional to degree, assesses convergence by plotting estimated proportions against wave number to verify the Markov chain has reached equilibrium, calculates bootstrap confidence intervals using the Salganik (2006) method that accounts for recruitment tree structure, tests for differential recruitment by key characteristic (detecting whether one group is more likely to recruit its own), runs a sensitivity analysis for seed bias by comparing estimates from the 6 seeds individually, calculates design effects versus simple random sampling, and produces a hidden population report with RDS-II prevalence estimates, convergence plots, recruitment pattern analysis, network degree distribution, and a methodological appendix justifying the equilibrium assumption.
Analyze respondent-driven sampling data from 6 seeds for a hidden population study. Apply RDS-II Volz-Heckathorn estimator with degree-based weights, assess Markov chain convergence by wave, calculate bootstrap CIs using Salganik method, test differential recruitment by group, run seed bias sensitivity with per-seed estimates, calculate design effects vs. SRS, and produce a hidden population report with RDS-II prevalence estimates, convergence plots, and methodology appendix.
RDS · Hidden Populations · Social Networks
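The RDS-II point estimate is an inverse-degree weighted proportion. The sketch below shows only that estimator; the function name and inputs are hypothetical, degrees are assumed to be positive, and the bootstrap intervals and convergence checks described above are not included.

```python
def rds2_proportion(has_trait, degrees):
    """Volz-Heckathorn (RDS-II) estimate: inverse-degree weighted share with the trait."""
    inverse_degrees = [1.0 / d for d in degrees]
    trait_weight = sum(w for w, t in zip(inverse_degrees, has_trait) if t)
    return trait_weight / sum(inverse_degrees)


# Hypothetical respondents: trait indicator and self-reported network degree
trait = [True, False, True, False, False]
degree = [2, 10, 4, 8, 5]
estimate = rds2_proportion(trait, degree)
```

Respondents with small networks are less likely to be recruited, so each one counts for more in the estimate.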
110
Soil Contamination Spatial Sampling Design
An environmental agency needs to design a soil sampling program to characterize heavy metal contamination across an industrial brownfield site of 8.5 hectares before remediation, ensuring sufficient spatial coverage to estimate mean contamination with CV ≤15% and to detect hotspots above the regulatory action threshold.
Quyi calculates the number of samples required for the mean estimation objective using the variance estimated from a 10-point pilot survey, generates a triangular grid sampling pattern providing uniform coverage and known spatial support, supplements with adaptive hot-spot detection sampling using a two-phase design where elevated pilot readings trigger local intensification, calculates the spatial support and scale of spatial autocorrelation using variogram estimation from pilot data to justify sample spacing, estimates the probability of detecting a hotspot of given radius and concentration above threshold as a function of sample density, designs composite sampling protocols for cost reduction where individual samples are pooled before laboratory analysis, and produces a sampling design report with grid map, pilot variogram, required sample n, composite pooling scheme, phase-2 trigger criteria, and cost-precision trade-off analysis.
Design a soil contamination sampling program for an 8.5-hectare brownfield. Calculate n for CV ≤15% from pilot variance, generate triangular grid, estimate spatial autocorrelation from pilot variogram to justify spacing, design two-phase adaptive hotspot sampling with trigger criteria, calculate hotspot detection probability vs. sample density, design composite pooling protocol, and produce a design report with grid map, variogram, sample n, pooling scheme, and cost-precision trade-off.
Environmental · Spatial Sampling · Remediation
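The first step, sizing the sample for a target CV of the mean, can be sketched under the simplifying assumption that samples are independent (spatial autocorrelation, which the variogram step addresses, is ignored here). The pilot readings and function name are hypothetical.

```python
from math import ceil
from statistics import mean, stdev


def samples_for_target_cv(pilot_values, target_cv):
    """Smallest n for which (pilot CV / sqrt(n)) is at or below target_cv."""
    pilot_cv = stdev(pilot_values) / mean(pilot_values)
    return ceil((pilot_cv / target_cv) ** 2)


# Hypothetical 10-point pilot survey of lead concentration (mg/kg)
pilot = [120, 340, 95, 410, 180, 260, 75, 530, 150, 220]
required_n = samples_for_target_cv(pilot, target_cv=0.15)
```

If the pilot variogram shows strong autocorrelation at the planned spacing, the effective sample size is smaller than n, so this figure should be read as a lower bound.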
111
Multi-Country Comparative Survey Harmonization
A cross-national research team has survey data measuring trust in institutions from 18 countries using nationally adapted question wordings, and needs to assess whether the scales are measuring the same construct across countries before making any cross-national comparisons.
Quyi fits a multi-group confirmatory factor analysis testing configural, metric, and scalar invariance across all 18 countries sequentially, identifies which specific item intercepts or loadings violate invariance using modification indices and expected parameter change statistics, applies partial scalar invariance by freeing non-invariant parameters while retaining invariant ones, recalculates which cross-national comparisons are valid under partial invariance, applies alignment optimization as an alternative to test whether country factor means can be compared despite local non-invariance, generates a cross-national invariance table with fit statistics at each level, a visualization of non-invariant items by country, and a comparative methods report specifying which countries and constructs support full vs. partial comparison — formatted as a measurement validation appendix for a cross-national journal.
Test measurement invariance for institutional trust scale across 18 countries. Fit multi-group CFA for configural, metric, and scalar invariance, identify non-invariant items with MI and EPC, apply partial invariance freeing non-invariant parameters, run alignment optimization as alternative, determine valid comparison subsets, generate invariance table and non-invariant item visualization, and produce a cross-national measurement validation appendix specifying which comparisons are supported.
Comparative Survey · Measurement · CFA
112
Bayesian Sample Size for Rare Disease Trial
A rare disease researcher faces the common challenge of designing a trial where the target population is only 400 patients worldwide, making traditional frequentist sample sizes unachievable, and needs to determine the maximum feasible sample size, specify an assurance-based design, and quantify the evidence that can realistically be gathered.
Quyi calculates the traditional frequentist n (which far exceeds the available population), designs instead a Bayesian study using a skeptical prior on treatment effect from prior studies, calculates the posterior probability of clinically meaningful benefit under multiple prior-data combinations across the feasible sample range (n=40 to n=120), derives the assurance (prior-averaged power) at each sample size, identifies the "minimum information sample" — the n at which the posterior probability of benefit exceeds a pre-specified threshold under plausible effect sizes, performs a prior sensitivity analysis showing how conclusions change from strongly skeptical to weakly informative priors, generates a Bayesian power curve, posterior distribution plots under n=80 and n=120, and produces a rare disease study design report with assurance curves, prior specification justification, regulatory strategy section discussing FDA guidance on Bayesian adaptive designs for rare diseases, and a decision framework for trial go/no-go.
Design a Bayesian rare disease trial with only 400 patients worldwide. Calculate frequentist n (showing infeasibility), design assurance-based Bayesian study with skeptical prior from existing data, compute posterior P(benefit) across n=40–120 with assurance at each n, identify minimum information sample, run prior sensitivity analysis, generate Bayesian power and assurance curves, and produce a rare disease design report with FDA Bayesian design strategy and go/no-go decision framework.
Rare Disease · Bayesian Design · Assurance
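To make the posterior probability of benefit concrete, the sketch below uses a normal prior with a normal likelihood, which is an assumption chosen for simplicity rather than the model Quyi necessarily fits. All parameter names and values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def posterior_prob_benefit(prior_mean, prior_sd, observed_effect, se, threshold=0.0):
    """P(effect > threshold | data) under a conjugate normal-normal model."""
    prior_precision = 1 / prior_sd ** 2
    data_precision = 1 / se ** 2
    post_var = 1 / (prior_precision + data_precision)
    post_mean = post_var * (prior_precision * prior_mean + data_precision * observed_effect)
    return 1 - NormalDist(post_mean, sqrt(post_var)).cdf(threshold)


# Same hypothetical trial result viewed under a skeptical and a weakly informative prior
skeptical = posterior_prob_benefit(0.0, 0.1, observed_effect=1.0, se=0.5)
weak = posterior_prob_benefit(0.0, 10.0, observed_effect=1.0, se=0.5)
```

Running the same data through both priors is the core of the prior sensitivity analysis: the skeptical prior pulls the posterior toward no effect and lowers the probability of benefit.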
🛒 Retail, Supermarkets & Small Business Intelligence · 30 examples · 113–142
113
Customer Loyalty Segmentation & RFM Scoring
A regional supermarket chain has 3 years of loyalty card transaction data and wants to identify which customer segments are most valuable, which are at risk of churning, and how to tailor promotions to each group.
Quyi computes Recency, Frequency, and Monetary (RFM) scores for every customer, bins each dimension into quintiles, assigns composite RFM segment labels (Champions, Loyal, At-Risk, Hibernating, Lost, etc.), calculates the revenue share and average basket size per segment, plots a 3-D RFM scatter and a heatmap of segment sizes, runs K-Means clustering on standardized RFM features to discover data-driven segments (elbow method + silhouette score to choose k), overlays cluster labels on RFM space, and produces a segment report with tailored retention recommendations: win-back email cadence for at-risk, reward tier upgrade for champions, re-engagement voucher for hibernating.
Compute RFM scores and quintile bins for every loyalty card customer. Label canonical segments (Champions, Loyal, At-Risk, Hibernating, Lost). Plot 3-D RFM scatter and segment heatmap. Run K-Means with elbow and silhouette selection. Overlay cluster labels. Report segment revenue share, average basket, and tailored retention action for each group.
Loyalty · RFM · Segmentation
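The quintile scoring step can be sketched in plain Python as below. The helper name and customer values are hypothetical, ties are broken by input order, and the segment labelling and K-Means steps are not shown.

```python
def quintile_scores(values, higher_is_better=True):
    """Rank-based scores from 1 (worst) to 5 (best)."""
    order = sorted(
        range(len(values)),
        key=lambda i: values[i],
        reverse=not higher_is_better,
    )
    scores = [0] * len(values)
    for rank, idx in enumerate(order):
        scores[idx] = 1 + (rank * 5) // len(values)
    return scores


# Hypothetical customers: days since last purchase, visit count, total spend
recency_days = [3, 40, 12, 90, 25]
frequency = [20, 2, 9, 1, 6]
monetary = [900.0, 60.0, 310.0, 15.0, 150.0]

r = quintile_scores(recency_days, higher_is_better=False)  # fewer days is better
f = quintile_scores(frequency)
m = quintile_scores(monetary)
rfm_codes = [f"{a}{b}{c}" for a, b, c in zip(r, f, m)]
```

A code such as "555" would map to a Champions-style segment and "111" to Lost; the exact mapping from codes to labels is a business choice.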
114
Market Basket Analysis & Cross-Sell Rules
A small grocery owner wants to know which products are bought together most often so they can redesign shelf layouts, create bundle promotions, and increase the average items-per-basket.
Quyi encodes the transaction log into a binary basket matrix, runs the Apriori algorithm to find frequent itemsets at multiple support thresholds, derives association rules with confidence, lift, conviction, and leverage metrics, filters to high-lift rules (lift > 2.5), visualizes the top rules as a network graph where node size encodes support and edge thickness encodes lift, creates a heatmap of co-occurrence frequency for the top 30 items, suggests specific shelf adjacency changes and bundle deals based on the strongest rules, and estimates the incremental revenue if the top 5 bundle promotions convert at a conservative 15% uptake rate.
Encode transactions as basket matrix. Run Apriori for frequent itemsets; derive association rules with support, confidence, lift, conviction, leverage. Filter lift > 2.5. Plot rules as a network graph and co-occurrence heatmap for top-30 items. Recommend shelf adjacency changes and bundles. Estimate incremental revenue at 15% bundle conversion.
Market Basket · Apriori · Cross-Sell
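The rule metrics named above have simple definitions, sketched here for a single antecedent and consequent pair. This computes the metrics directly rather than running Apriori, assumes the antecedent appears in at least one basket, and uses hypothetical baskets.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Support, confidence, lift, leverage and conviction for antecedent -> consequent."""
    n = len(transactions)
    n_a = sum(1 for t in transactions if antecedent <= t)
    n_c = sum(1 for t in transactions if consequent <= t)
    n_both = sum(1 for t in transactions if (antecedent | consequent) <= t)
    support = n_both / n
    confidence = n_both / n_a
    lift = confidence / (n_c / n)
    leverage = support - (n_a / n) * (n_c / n)
    conviction = float("inf") if confidence == 1 else (1 - n_c / n) / (1 - confidence)
    return {
        "support": support,
        "confidence": confidence,
        "lift": lift,
        "leverage": leverage,
        "conviction": conviction,
    }


# Hypothetical baskets
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread"},
    {"milk"},
    {"butter", "milk"},
]
metrics = rule_metrics(baskets, {"bread"}, {"butter"})
```

A lift above 1 means the two items appear together more often than independence would predict; the 2.5 cutoff in this example is a stricter filter for strong rules.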
115
Churn Prediction & Retention Scoring
An independent coffee shop chain has a loyalty app and notices that some registered members stop visiting after a few months. The owner wants a predictive model to identify who is about to churn so staff can proactively reach out with a personalized offer.
Quyi engineers features from visit history (days since last visit, visit frequency trend over 30/60/90 days, average spend, category diversity, redemption rate, gap between consecutive visits), defines churn as no visit in the last 45 days, trains a Gradient Boosting Classifier (XGBoost) with SMOTE oversampling to handle class imbalance, tunes hyperparameters via 5-fold cross-validation optimizing F1-score, evaluates with ROC-AUC, precision-recall curve, and confusion matrix, calculates SHAP values to explain each prediction, ranks all active members by churn probability, and generates a prioritized outreach list with the top SHAP driver for each at-risk customer so staff can personalize the message.
Engineer visit-history features (recency, frequency trends, spend, category diversity, redemption). Define churn = no visit in 45 days. Train XGBoost with SMOTE, tune via 5-fold CV on F1. Report ROC-AUC, precision-recall, confusion matrix. Compute SHAP values per customer. Output a ranked churn-risk list with the top SHAP driver per customer for personalized outreach.
Churn · XGBoost · SHAP
116
Dynamic Pricing & Price Elasticity by Category
A supermarket wants to understand how price changes in its fresh produce and dairy categories affect demand, so it can set optimal prices that maximize revenue without losing volume-sensitive customers.
Quyi estimates own-price elasticity for each SKU using log-log OLS regression controlling for promotions, seasonality, day-of-week, and competing SKU prices, identifies elastic (|ε| > 1) vs. inelastic products, estimates cross-price elasticities between substitutable SKUs (e.g., branded vs. private-label yogurt), plots elasticity heatmaps by category and by price tier, simulates revenue and volume outcomes under ±5%, ±10%, ±15% price scenarios for each elastic SKU, applies a price optimization model that maximizes category revenue subject to minimum volume constraints, and produces a pricing playbook with recommended price points, expected revenue lift, and guidance on which SKUs should be loss-leaders to drive foot traffic.
Estimate own-price and cross-price elasticities for each SKU via log-log OLS controlling for promotions, seasonality, weekday, and competitor prices. Plot elasticity heatmap by category and price tier. Simulate revenue and volume at ±5/10/15% price scenarios. Optimize prices to maximize category revenue subject to volume floor. Output pricing playbook with recommended price points and expected revenue lift.
Pricing · Elasticity · Revenue Optimization
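In a log-log regression the slope on log price is the own-price elasticity. The sketch below fits that slope for one SKU with no control variables, which is a deliberate simplification of the controlled model described above; the price and quantity series are synthetic.

```python
from math import log


def log_log_elasticity(prices, quantities):
    """OLS slope of ln(quantity) on ln(price), with no controls."""
    x = [log(p) for p in prices]
    y = [log(q) for q in quantities]
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


# Synthetic SKU whose demand follows q = 100 * p^-1.5 exactly
prices = [1.0, 1.5, 2.0, 2.5, 3.0]
quantities = [100 * p ** -1.5 for p in prices]
elasticity = log_log_elasticity(prices, quantities)
is_elastic = abs(elasticity) > 1
```

Real sales data would need the promotion, seasonality, and competitor price terms added as further regressors before the slope can be read as an elasticity.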
117
Demand Forecasting & Inventory Optimization
A small supermarket suffers from both overstocking perishables and running out of fast-moving items mid-week. The owner wants accurate 7- and 14-day demand forecasts at the SKU level to automate reorder quantities and reduce waste.
Quyi decomposes each SKU's daily sales into trend, weekly seasonality, and holiday effects using Facebook Prophet, benchmarks against SARIMA and a LightGBM model with lag features, selects the best model per SKU by MASE on a rolling validation window, generates 7- and 14-day point forecasts with 80% and 95% prediction intervals, calculates the optimal reorder point and economic order quantity (EOQ) for each SKU given lead time and holding cost assumptions, flags SKUs where current stock would run out before next delivery, estimates the financial impact of reducing overstock by 20% (spoilage savings), and produces a dashboard-ready CSV with reorder signals and a visual forecast report per category.
Forecast daily SKU demand for 7 and 14 days using Prophet, SARIMA, and LightGBM with lag features; select best per SKU by MASE on rolling validation. Generate forecasts with 80%/95% PI. Calculate EOQ and reorder point per SKU given lead time and holding costs. Flag stockout risk. Estimate spoilage savings from 20% overstock reduction. Output reorder signal CSV and category forecast report.
Forecasting · Inventory · EOQ
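The reorder calculations can be sketched with the classic formulas EOQ = sqrt(2DS/H) and reorder point = lead-time demand plus safety stock. The fixed cost per order is an input the EOQ formula needs in addition to the lead time and holding cost mentioned above; all names and figures here are hypothetical.

```python
from math import sqrt


def economic_order_quantity(annual_demand, order_cost, holding_cost_per_unit):
    """Classic EOQ: sqrt(2 * D * S / H)."""
    return sqrt(2 * annual_demand * order_cost / holding_cost_per_unit)


def reorder_point(mean_daily_demand, lead_time_days, daily_demand_sd=0.0, z=1.645):
    """Expected lead-time demand plus safety stock (z = 1.645 is about a 95% service level)."""
    safety_stock = z * daily_demand_sd * sqrt(lead_time_days)
    return mean_daily_demand * lead_time_days + safety_stock


# Hypothetical SKU
eoq = economic_order_quantity(annual_demand=1000, order_cost=20, holding_cost_per_unit=4)
rop = reorder_point(mean_daily_demand=10, lead_time_days=4, daily_demand_sd=3.0)
```

For perishables, the EOQ would normally be capped by shelf life, which this sketch does not model.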
118
Promotion & Discount Effectiveness Analysis
A retail chain runs weekly flyer promotions and percent-off discounts but has no systematic way to measure whether a promotion drives genuine incremental sales or just pulls forward purchases that would have happened anyway.
Quyi applies a difference-in-differences design using non-promoted stores as controls, tests parallel trends assumption in the pre-promotion period, estimates the average treatment effect on the treated (ATT) for each promotion event, decomposes the lift into new customers, increased frequency, and increased basket size, calculates the return on promotion investment (ROPI = incremental gross profit / promotion cost), identifies cannibalization effects on adjacent SKUs, tests whether post-promotion dip ("trough effect") erases short-term lift, plots event-study graphs showing the treatment path ±4 weeks around each promotion, and ranks all promotions by ROPI so the owner can eliminate low-return events and double down on proven winners.
Use difference-in-differences with non-promoted stores as controls to estimate ATT per promotion event. Test parallel trends. Decompose lift into new customers, frequency, basket size. Calculate ROPI = incremental GP / cost. Detect cannibalization and post-promo trough. Plot event-study graphs ±4 weeks. Rank promotions by ROPI and recommend which to cut or expand.
Promotions · DiD · ROPI
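The core difference-in-differences arithmetic and the ROPI ratio can be sketched as follows. This is the two-period, two-group version with hypothetical weekly sales figures; it omits the parallel-trends test and the regression form that yields standard errors.

```python
from statistics import mean


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Change in promoted stores minus change in control stores."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


def ropi(incremental_gross_profit, promotion_cost):
    """Return on promotion investment: incremental gross profit per unit of cost."""
    return incremental_gross_profit / promotion_cost


# Hypothetical weekly unit sales per store
lift = did_estimate(
    treated_pre=[100, 100], treated_post=[130, 130],
    control_pre=[90, 90], control_post=[100, 100],
)
promo_return = ropi(incremental_gross_profit=3000, promotion_cost=2000)
```

Here the promoted stores rose by 30 units but control stores rose by 10 over the same weeks, so only 20 units are attributed to the promotion.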
119
Product Assortment Optimization
A small supermarket carries 8,400 SKUs but limited shelf space. The category manager suspects a long tail of slow-moving products is tying up capital and shelf space that could be given to better-performing items or new product introductions.
Quyi calculates gross margin return on inventory investment (GMROI) per SKU, applies the 80/20 Pareto rule to identify the top contributors to revenue and gross profit, segments the assortment into Must-Have, Performance, Opportunity, and Rationalize quadrants based on velocity and margin, identifies "zombie SKUs" — items with fewer than 3 units sold in the last 90 days — calculating their holding cost and the opportunity cost of the shelf space, estimates the revenue uplift potential from delisting bottom-quartile SKUs and replacing with category growth items from trend data, performs a substitutability analysis to assess which delistings would cause customers to buy an alternative vs. go to a competitor, and delivers a rationalization report with delisting candidates, expected SKU count reduction, freed shelf meters, and reallocation recommendations.
Calculate GMROI per SKU. Apply Pareto 80/20 to revenue and GP. Segment assortment into Must-Have, Performance, Opportunity, Rationalize quadrants. Identify zombie SKUs (fewer than 3 units/90 days), calculate holding cost and shelf opportunity cost. Estimate revenue uplift from delisting bottom-quartile. Run substitutability analysis. Deliver rationalization report with delisting candidates, freed shelf meters, and reallocation plan.
Assortment · GMROI · Rationalization
120
Store Layout & Planogram Heatmap Analysis
A supermarket owner installed a footfall tracking system and now has anonymous path-tracing data for thousands of shopping trips. They want to understand which store zones are underutilized, where customers linger, and whether the current planogram aligns products with foot traffic.
Quyi aggregates path data into a grid-based heatmap of dwell time and zone visit frequency, calculates the "exposure-to-purchase" conversion rate per zone (customers who visited the zone vs. those who bought from it), identifies cold zones — high-traffic areas with low conversion — that may benefit from product repositioning, computes cross-zone transition probabilities to understand natural shopping flow, overlays the heatmap on a store blueprint SVG, tests whether premium shelf positions (eye-level) have higher conversion than floor-level for the same category, runs a chi-square test for independence between traffic zone and basket size, and recommends planogram changes to route shoppers past high-margin impulse categories, increase cold-zone conversion, and reduce congestion at bottlenecks.
Aggregate path-trace data into a dwell-time and visit-frequency heatmap. Calculate exposure-to-purchase conversion per zone. Identify cold zones and hot zones. Compute cross-zone transition probabilities. Overlay heatmap on store blueprint. Test eye-level vs. floor-level conversion via chi-square. Recommend planogram changes to improve high-margin category exposure and reduce bottlenecks.
Store Layout · Heatmap · Planogram
121
Private Label vs. National Brand Margin Analysis
A supermarket group wants to expand its private label range and needs to quantify the margin advantage of own-label products, understand which categories are most susceptible to brand switching, and project the profit impact of a 5-point private label share gain.
Quyi calculates gross margin percentage and gross margin per unit for every SKU, computes average margin by brand tier (national, regional, private label) within each category, runs a logistic regression to identify the driver variables most associated with private-label purchase (price gap, promotion frequency, product type, household income segment, loyalty tier), estimates price gap sensitivity — the price premium customers will pay for national brands before switching, simulates the profit impact of a 5pp private label volume share shift by category, plots a margin waterfall comparing current vs. simulated assortment, and produces a category-by-category playbook identifying which categories have the best conditions for private label expansion (high price gap, low brand loyalty, large volume).
Calculate gross margin % and per-unit margin by brand tier (national, regional, private label) per category. Logistic regression on private-label purchase drivers (price gap, promotions, income segment). Estimate price-premium switching threshold. Simulate profit impact of +5pp private label share shift per category. Plot margin waterfall. Produce category playbook ranking best private label expansion opportunities.
Private Label · Margin · Category Strategy
122
Customer Lifetime Value (CLV) Modeling
A boutique grocery chain wants to move beyond average transaction value and understand the true long-term worth of each customer to make smarter decisions about acquisition spending, loyalty reward tiers, and personalized service investment.
Quyi fits a BG/NBD (Beta-Geometric / Negative Binomial Distribution) model to transaction history to estimate individual purchase frequency and dropout probability, then combines it with a Gamma-Gamma model to predict future monetary value, produces 12-month and 3-year CLV estimates per customer, segments customers into CLV tiers (top 5%, top 20%, middle 50%, bottom 30%), calculates the expected payback period for different acquisition cost scenarios, computes the net present value of each loyalty reward program investment by tier, visualizes CLV distribution with percentile bands, and delivers a business case for differentiated service investment — e.g., dedicated checkout lanes or free delivery for top-5% CLV customers — with expected revenue protection value.
Fit BG/NBD model to transaction history for purchase frequency and churn probability. Combine with Gamma-Gamma model for monetary value. Generate 12-month and 3-year CLV per customer. Segment into CLV tiers (top 5/20/50% + bottom 30%). Calculate acquisition cost payback by tier. Compute NPV of loyalty investment per tier. Visualize CLV distribution. Build business case for differentiated service investment in top-tier customers.
CLV · BG/NBD · Retention
123
Seasonal Sales Decomposition & Buying Calendar
A small supermarket owner wants to understand exactly how seasonal patterns affect sales by category so they can plan the buying calendar, negotiate better supplier terms during off-peak periods, and avoid the cash-flow crunch that comes from over-ordering at peak season.
Quyi decomposes weekly sales per category into trend, seasonal, and residual components using STL decomposition (Seasonal-Trend by LOESS), extracts the seasonal index for each week of the year per category, identifies peak and trough weeks, calculates the seasonal amplification ratio (peak/trough) and compares it across categories to rank the most volatile, overlays holiday calendar events on the seasonal plots to attribute spikes, plots a 52-week seasonal calendar heatmap per category, estimates the working capital tied up in excess seasonal inventory at peak, and produces a buying calendar recommendation with suggested order-up-to levels by week, negotiation windows with suppliers (troughs), and a cash-flow projection aligned with the seasonal cycle.
Apply STL decomposition to weekly category sales. Extract 52-week seasonal index per category. Identify peak/trough weeks and seasonal amplification ratio (peak/trough). Overlay holiday events. Plot seasonal calendar heatmap. Estimate working capital tied in peak inventory. Produce buying calendar with order-up-to levels by week, supplier negotiation windows, and cash-flow projection aligned to seasonal cycle.
Seasonality · STL · Buying Calendar
124
Shrinkage & Waste Attribution Analysis
A supermarket manager notices that shrinkage (the gap between expected and actual inventory) is eroding margins but cannot determine how much is due to theft, spoilage, supplier short-deliveries, or scanning errors — each requiring a different operational response.
Quyi reconciles point-of-sale sell-through data with receiving records and physical count audits to calculate unexplained shrinkage per SKU, applies a statistical anomaly detection model (Isolation Forest) to flag SKUs with shrinkage rates that are statistical outliers given their category norms, correlates shrinkage with store zone (front vs. back), product value, packaging type, staff shift, and supplier to identify structural drivers, calculates spoilage loss separately using markdown and write-off records for perishables, estimates the annualized cost of shrinkage by attribution category (theft, spoilage, admin error), plots a Pareto chart of shrinkage cost by product and a time-series of shrinkage rate by shift to identify operational patterns, and delivers a loss-prevention action plan prioritizing the top 3 drivers by financial impact.
Reconcile POS sell-through with receiving records and physical counts to calculate shrinkage per SKU. Apply Isolation Forest to flag outlier shrinkage rates. Correlate with zone, value, packaging, staff shift, and supplier. Separate spoilage loss from markdown records. Estimate annualized cost by category (theft, spoilage, admin error). Plot Pareto by product and time-series by shift. Deliver loss-prevention action plan targeting top-3 cost drivers.
Shrinkage · Loss Prevention · Anomaly Detection
125
Loyalty Program ROI & Reward Tier Redesign
A mid-size supermarket has run a points-based loyalty program for 4 years and suspects the reward costs are eating into margins without actually driving the incremental purchases they were designed to stimulate.
Quyi performs a matched-pair propensity score matching analysis comparing loyalty members to similar non-members (matched on demographics, basket type, store location), estimates the incremental spend attributable to program membership (removing selection bias), calculates the program cost (points redeemed + administration) vs. incremental revenue generated to derive a true program ROI, segments ROI by reward tier (Silver, Gold, Platinum), identifies the "free-rider" effect — members who would have spent the same without the program — and quantifies their share of total reward cost, runs a simulation of a redesigned tiered structure with higher earn thresholds and experiential rewards to increase incrementality, and delivers a redesign proposal with projected ROI improvement, tier threshold recommendations, and A/B test design for the new structure.
Use propensity score matching to compare loyalty members vs. similar non-members. Estimate incremental spend attributable to program membership. Calculate program ROI (incremental revenue vs. points + admin cost) by reward tier. Quantify free-rider share of reward cost. Simulate redesigned tiered structure with higher earn thresholds and experiential rewards. Deliver redesign proposal with projected ROI improvement and A/B test design.
Loyalty ROI · Propensity Matching · Program Design
126
Social Media Sentiment & Brand Reputation Tracker
A small supermarket chain receives reviews on Google Maps, Facebook, and a local community app. The owner wants to systematically monitor reputation trends, understand which store locations are underperforming on customer experience, and identify which operational issues are most frequently mentioned.
Quyi processes all text reviews through a fine-tuned BERT sentiment classifier to score each review as positive, neutral, or negative with a confidence score, extracts the most frequent complaint and praise themes using LDA topic modeling, maps themes to operational categories (cleanliness, staff friendliness, queue length, freshness, price, parking), calculates a monthly Net Sentiment Score (% positive − % negative) per store location, runs a temporal analysis to detect sentiment trend reversals (sudden drops flagged as alerts), correlates sentiment score with same-store sales growth to quantify the revenue impact of a 10-point sentiment shift, and generates a location-level dashboard with sentiment trends, top issue themes, and actionable staff training priorities.
Classify all reviews with BERT sentiment (positive/neutral/negative). Extract complaint and praise themes via LDA; map to operational categories (cleanliness, staff, queues, freshness, price). Calculate monthly Net Sentiment Score per store. Detect trend reversals. Correlate sentiment score with same-store sales growth. Generate location-level dashboard with sentiment trends, top issues, and staff training priorities.
Sentiment · NLP · Reputation
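As a rough illustration of the Net Sentiment Score step in this use case, the sketch below computes % positive minus % negative per store and month. It assumes every review has already been labelled by the sentiment classifier; the function name `net_sentiment_by_store_month` and the sample reviews are hypothetical and are not Quyi's generated code.

```python
from collections import Counter, defaultdict


def net_sentiment_by_store_month(reviews):
    """reviews: iterable of (store, month, label), where label is
    "positive", "neutral", or "negative".
    Returns {(store, month): % positive - % negative}."""
    counts = defaultdict(Counter)
    for store, month, label in reviews:
        counts[(store, month)][label] += 1

    scores = {}
    for key, label_counts in counts.items():
        total = sum(label_counts.values())
        scores[key] = 100.0 * (label_counts["positive"] - label_counts["negative"]) / total
    return scores


# Hypothetical labelled reviews (in practice the labels come from the classifier).
sample = [
    ("Store A", "2024-01", "positive"),
    ("Store A", "2024-01", "positive"),
    ("Store A", "2024-01", "negative"),
    ("Store A", "2024-01", "neutral"),
    ("Store B", "2024-01", "negative"),
    ("Store B", "2024-01", "neutral"),
]
scores = net_sentiment_by_store_month(sample)
for key in sorted(scores):
    print(key, scores[key])
```

Neutral reviews count toward the denominator only, so a store with many neutral reviews moves toward a score of zero.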
127
Supplier Performance & Cost Benchmarking
A supermarket buyer manages 120 active suppliers and wants a data-driven scorecard to identify which suppliers are delivering the best value (on-time, correct quantities, competitive price) and which are creating cost and operational risk through late deliveries and quality failures.
Quyi calculates a weighted supplier scorecard across five dimensions: On-Time In-Full (OTIF) rate, invoice accuracy, price competitiveness vs. market index, defect/returns rate, and lead time variability (coefficient of variation), normalizes each dimension to a 0–100 scale, applies a weighted composite score with customizable weights, segments suppliers into Preferred, Approved, Conditional, and Develop quadrants using a performance vs. strategic importance matrix, identifies suppliers with deteriorating trend (score dropped >10 points in 6 months), calculates the total cost of poor supplier performance (returns, emergency orders, stockouts caused by OTIF failures), plots a radar chart per supplier and a portfolio scatter of performance vs. spend, and delivers a quarterly business review template highlighting top 5 suppliers for promotion to preferred status and bottom 5 for improvement plans.
Score 120 suppliers on OTIF, invoice accuracy, price vs. market index, defect rate, and lead time CV. Normalize to 0–100 and compute weighted composite. Segment into Preferred/Approved/Conditional/Develop quadrants. Flag declining trends (drop >10 pts in 6 months). Calculate total cost of poor performance. Plot radar charts and performance-vs-spend scatter. Produce QBR template with top-5 for promotion and bottom-5 improvement plans.
Supplier · Scorecard · Procurement
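The normalization and weighting steps of this scorecard can be sketched as follows. This is a minimal example under stated assumptions: min-max scaling is used for the 0–100 normalization (the text does not say which scaling Quyi applies), and the supplier names, dimensions, and weights are hypothetical.

```python
def normalize(values, higher_is_better=True):
    """Min-max scale a list of raw values to 0-100 (assumed scaling method)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [50.0 for _ in values]  # no spread: give everyone a neutral score
    scaled = [100.0 * (v - lo) / (hi - lo) for v in values]
    return scaled if higher_is_better else [100.0 - s for s in scaled]


def composite_scores(suppliers, weights, higher_is_better):
    """suppliers: {name: {dimension: raw value}}.
    Returns {name: weighted composite on a 0-100 scale}."""
    names = list(suppliers)
    weight_sum = sum(weights.values())
    totals = {name: 0.0 for name in names}
    for dim, weight in weights.items():
        column = normalize([suppliers[n][dim] for n in names], higher_is_better[dim])
        for name, score in zip(names, column):
            totals[name] += (weight / weight_sum) * score
    return totals


# Hypothetical data: OTIF in %, defect rate in %.
suppliers = {
    "S1": {"otif": 98, "defect_rate": 1},
    "S2": {"otif": 90, "defect_rate": 3},
    "S3": {"otif": 94, "defect_rate": 2},
}
weights = {"otif": 60, "defect_rate": 40}
direction = {"otif": True, "defect_rate": False}  # lower defect rate is better
scorecard = composite_scores(suppliers, weights, direction)
print(scorecard)
```

Because min-max scaling is relative to the current supplier pool, scores shift when suppliers are added or removed; a fixed reference scale would avoid that at the cost of choosing the bounds by hand.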
128
Checkout Queue & Staffing Optimization
A supermarket manager receives frequent complaints about long checkout queues during peak hours but also has budget constraints that prevent simply adding more cashiers at all times. They need a data-driven staffing model to deploy the right number of staff at the right times.
Quyi models the checkout system as an M/M/c multi-server queue, fits the arrival rate λ by hour-of-day and day-of-week from transaction timestamps, estimates service rate μ per cashier from average transaction duration, calculates queue length, expected wait time, and server utilization for each staffing level scenario, determines the minimum staffing level that keeps average wait below 3 minutes at each time period, computes the staffing cost for the current vs. optimized schedule at hourly wage rates, compares self-checkout lanes (typically lower μ per lane, but lower cost per lane) vs. staffed lanes under the same wait-time constraint, produces an optimal weekly staffing schedule heatmap by day and hour, and quantifies the annual labor savings from the optimized schedule vs. current practice.
Model checkout as M/M/c queue. Fit arrival rate λ by hour and day-of-week from transaction timestamps. Estimate service rate μ from average transaction duration. Calculate wait time and utilization per staffing level. Find minimum staff for avg wait < 3 min per time slot. Compare staffed vs. self-checkout economics. Produce weekly staffing heatmap and quantify annual labor savings vs. current schedule.
Queuing · Staffing · Operations
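The wait-time calculation behind this use case can be sketched with the standard Erlang C formula for an M/M/c queue. The function names and the example rates are hypothetical; the sketch assumes Poisson arrivals and exponential service times, which real checkout data may only approximate.

```python
from math import factorial


def erlang_c_wait(lam, mu, c):
    """Expected time in queue (Wq) for an M/M/c system.
    lam: arrival rate, mu: service rate per cashier (same time unit).
    Returns infinity when the system is unstable (lam >= c * mu)."""
    if lam >= c * mu:
        return float("inf")
    a = lam / mu          # offered load in Erlangs
    rho = a / c           # utilization per cashier
    tail = a**c / (factorial(c) * (1.0 - rho))
    p0 = 1.0 / (sum(a**k / factorial(k) for k in range(c)) + tail)
    p_wait = tail * p0    # Erlang C: probability an arriving customer must queue
    return p_wait / (c * mu - lam)


def min_cashiers(lam, mu, max_wait, c_max=50):
    """Smallest number of cashiers whose average wait is below max_wait."""
    for c in range(1, c_max + 1):
        if erlang_c_wait(lam, mu, c) < max_wait:
            return c
    return None


# Hypothetical peak hour: 2 customers/min arrive, each transaction takes 2 min.
staff_needed = min_cashiers(lam=2.0, mu=0.5, max_wait=3.0)
print(staff_needed, round(erlang_c_wait(2.0, 0.5, staff_needed), 2))
```

Running `min_cashiers` once per hour-of-day and day-of-week cell, with λ fitted for that cell, would give the values for a staffing heatmap.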
129
Email & SMS Campaign Attribution Modeling
A small retail business sends weekly email and bi-weekly SMS campaigns and wants to know which channel, which message type, and which send-time drives the most in-store and online purchases — going beyond open rates to true revenue attribution.
Quyi links campaign send logs to transaction records using customer IDs with a 7-day attribution window, applies multi-touch attribution models (first touch, last touch, linear, time-decay, and Shapley value), compares attribution outcomes across models to identify where they agree and disagree, calculates revenue-per-send, cost-per-acquisition, and return on marketing investment (ROMI) per channel and per campaign type, runs an uplift model (two-model approach) to separate customers persuaded by the campaign from those who would have purchased anyway, segments results by RFM tier (do promotions work better on Champions or At-Risk customers?), identifies the optimal send time by day and hour using a regression on open-to-purchase conversion rate, and delivers a campaign optimization report with channel budget reallocation, optimal send-time calendar, and message-type recommendations per segment.
Link campaign logs to transactions with 7-day attribution window. Apply first-touch, last-touch, linear, time-decay, and Shapley attribution. Calculate revenue-per-send, CPA, ROMI per channel and campaign type. Run uplift model (two-model approach) to estimate persuasion effect. Segment results by RFM tier. Find optimal send time via regression on open-to-purchase conversion. Deliver campaign optimization report with channel budget reallocation and send-time calendar.
Attribution · Marketing · ROMI
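Four of the attribution rules named in this use case (first touch, last touch, linear, time-decay) can be sketched for a single customer journey as below. Shapley attribution is left out because it needs conversion data across many journeys. The function name `attribute`, the 3-day half-life, and the sample journey are assumptions for illustration only.

```python
def attribute(touches, revenue, model="linear", half_life_days=3.0):
    """touches: list of (channel, days_before_purchase), oldest first.
    Returns {channel: revenue credited under the chosen model}."""
    n = len(touches)
    if model == "first_touch":
        weights = [1.0] + [0.0] * (n - 1)
    elif model == "last_touch":
        weights = [0.0] * (n - 1) + [1.0]
    elif model == "linear":
        weights = [1.0] * n
    elif model == "time_decay":
        # Credit halves for every half_life_days between the touch and the purchase.
        weights = [0.5 ** (days / half_life_days) for _, days in touches]
    else:
        raise ValueError(f"unknown model: {model}")

    total = sum(weights)
    credit = {}
    for (channel, _), weight in zip(touches, weights):
        credit[channel] = credit.get(channel, 0.0) + revenue * weight / total
    return credit


# Hypothetical journey inside the 7-day window: email, then SMS, then email.
journey = [("email", 6), ("sms", 3), ("email", 0)]
results = {m: attribute(journey, 70.0, m)
           for m in ("first_touch", "last_touch", "linear", "time_decay")}
for model, credit in results.items():
    print(model, credit)
```

Comparing the per-channel totals across models, summed over all journeys, shows where the models agree and where channel credit depends on the rule chosen.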
130
ABC/XYZ Inventory Classification & Replenishment Policy
A small business owner wants to stop managing 5,000 SKUs with a one-size-fits-all replenishment policy and instead apply differentiated strategies based on both the financial importance (ABC) and demand predictability (XYZ) of each product.
Quyi computes cumulative revenue contribution per SKU (ABC: A = SKUs generating the first 80% of cumulative revenue, B = next 15%, C = last 5%), calculates the coefficient of variation of weekly demand per SKU (XYZ: X=CV<0.5 stable, Y=0.5–1.0 variable, Z=CV>1.0 highly erratic), creates the 3×3 ABC-XYZ matrix with count and revenue share in each cell, assigns a tailored replenishment policy to each matrix cell (e.g., AX=continuous review with tight safety stock, CZ=periodic review with large buffer, AZ=real-time monitoring with manual intervention), calculates safety stock levels under each policy using demand variability and service level targets, estimates the total inventory investment under the new differentiated policy vs. the current uniform policy, and generates a migration plan ranked by expected inventory reduction per effort invested.
Classify all SKUs by ABC (cumulative revenue) and XYZ (demand CV: X<0.5, Y=0.5–1.0, Z>1.0). Build 3×3 ABC-XYZ matrix with count and revenue share per cell. Assign differentiated replenishment policy per cell (continuous/periodic/manual). Calculate safety stock per policy at defined service level. Compare inventory investment under new vs. uniform policy. Generate migration plan ranked by expected inventory reduction per effort.
Inventory · ABC-XYZ · Replenishment
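The classification rules in this use case can be sketched as follows. The 80% / 95% revenue cut-offs and the CV thresholds come from the description above; the function name, the use of the population standard deviation, the rule that an SKU crossing a cut-off falls into the next class, and the sample SKUs are assumptions.

```python
from statistics import mean, pstdev


def classify_abc_xyz(skus):
    """skus: {sku: {"revenue": float, "weekly_demand": [units, ...]}}.
    Returns {sku: two-letter class such as "AX"}."""
    total_revenue = sum(s["revenue"] for s in skus.values())
    ranked = sorted(skus, key=lambda k: skus[k]["revenue"], reverse=True)

    labels = {}
    cumulative = 0.0
    for sku in ranked:
        cumulative += skus[sku]["revenue"]
        share = cumulative / total_revenue
        abc = "A" if share <= 0.80 else "B" if share <= 0.95 else "C"

        demand = skus[sku]["weekly_demand"]
        avg = mean(demand)
        cv = pstdev(demand) / avg if avg > 0 else float("inf")
        xyz = "X" if cv < 0.5 else "Y" if cv <= 1.0 else "Z"

        labels[sku] = abc + xyz
    return labels


# Hypothetical four-SKU catalogue.
catalogue = {
    "SKU-1": {"revenue": 700, "weekly_demand": [10, 11, 9, 10]},
    "SKU-2": {"revenue": 180, "weekly_demand": [5, 0, 10, 5]},
    "SKU-3": {"revenue": 90, "weekly_demand": [0, 0, 12, 0]},
    "SKU-4": {"revenue": 30, "weekly_demand": [2, 2, 2, 2]},
}
labels = classify_abc_xyz(catalogue)
print(labels)
```

Counting labels and summing revenue per label gives the 3×3 matrix to which replenishment policies are then assigned.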
131
New Store Location Intelligence & Cannibalization Risk
A supermarket chain is evaluating five candidate sites for a new store and wants to quantify the expected revenue of each site, assess which competitor stores would be displaced, and measure how much the new store would cannibalize sales from its own existing network.
Quyi builds a gravity model using each site's catchment population, drive-time isochrones, competitor store density, and median household income to predict annual revenue, calibrates model parameters using existing stores in similar market conditions, estimates the market share captured from each competitor (Huff model probabilities), calculates self-cannibalization by identifying existing customers in the new store's catchment whose nearest store would change, runs a financial model comparing net new revenue (after cannibalization) to expected site investment and operating costs over 5 years, scores each of the five candidate sites by ROI, payback period, and cannibalization rate, maps catchment areas and competitor presence using a spatial visualization, and delivers a site selection recommendation with a sensitivity analysis on cannibalization assumptions.
Build gravity model (population, drive-time, competitor density, income) to forecast revenue for 5 candidate sites. Calibrate using existing stores. Estimate competitor market share displacement via Huff model. Calculate self-cannibalization for customers whose nearest own-store changes. Build 5-year financial model per site (net new revenue after cannibalization vs. investment). Score sites by ROI, payback, cannibalization rate. Map catchments and deliver site selection recommendation with sensitivity analysis.
Site Selection · Gravity Model · Cannibalization
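The Huff probabilities and the self-cannibalization estimate can be sketched for a single customer zone as below. Store attractiveness (here floor area), drive times, and the distance-decay exponent `beta=2.0` are hypothetical; in practice the exponent would be calibrated on existing stores as the description says.

```python
def huff_probabilities(stores, beta=2.0):
    """stores: {name: (attractiveness, travel_minutes)} as seen from one zone.
    Returns {name: probability that a customer in the zone shops there}."""
    utility = {name: attract / (minutes ** beta)
               for name, (attract, minutes) in stores.items()}
    total = sum(utility.values())
    return {name: u / total for name, u in utility.items()}


# Hypothetical zone: one own store and one competitor today.
today = {"Own existing": (1500, 5), "Competitor": (3000, 15)}
with_new_site = dict(today, **{"New site": (2000, 10)})

before = huff_probabilities(today)
after = huff_probabilities(with_new_site)

# Share the existing own store loses once the new site opens.
cannibalized_share = before["Own existing"] - after["Own existing"]
# Share taken from the competitor.
competitor_loss = before["Competitor"] - after["Competitor"]
print(round(after["New site"], 3), round(cannibalized_share, 3), round(competitor_loss, 3))
```

Multiplying each zone's share changes by that zone's grocery spend and summing over zones would give revenue captured from competitors versus revenue cannibalized from the existing network.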
132
Customer Satisfaction Driver Analysis (CSAT/NPS)
A small supermarket chain collects monthly CSAT scores across 8 service dimensions (product freshness, price, staff helpfulness, cleanliness, queue speed, product availability, parking, online ordering). Leadership wants to know which dimensions are most responsible for top-box satisfaction and where to invest first for the biggest improvement.
Quyi runs a key driver analysis using Shapley value regression (relative importance of each attribute in explaining overall CSAT variance), builds a penalty-reward asymmetry analysis (Kano-inspired) distinguishing hygiene factors (low score destroys satisfaction) from delight factors (high score disproportionately lifts satisfaction), calculates the importance-performance gap for each dimension to identify "quick wins" (high importance, low current score) vs. "don't over-invest" areas (low importance, high score), runs a driver analysis separately by customer segment (age, CLV tier, visit frequency), plots a 2×2 importance-performance matrix, and delivers a prioritized CX investment roadmap with expected CSAT point improvement per initiative estimated via a response function curve.
Run Shapley value regression to rank drivers of overall CSAT across 8 dimensions. Run penalty-reward asymmetry analysis to classify hygiene vs. delight factors. Calculate importance-performance gap. Run driver analysis separately by age, CLV tier, and visit frequency. Plot 2×2 importance-performance matrix. Deliver prioritized CX investment roadmap with estimated CSAT lift per initiative via response function curve.
CSAT · NPS · Driver Analysis
133
Flash Sale & Limited-Time Offer Optimization
A small online retailer runs flash sales every Friday and wants to decide which products to feature, what discount depth to offer, how long to run the sale, and what traffic channel to use — all to maximize gross profit rather than just GMV.
Quyi analyzes historical flash sale data to estimate the demand uplift function (log-linear) relating discount depth to units sold after controlling for product category, day, and traffic channel, calculates the gross profit-maximizing discount depth for each candidate product given their margin structure, tests whether longer sale duration (6h vs. 12h vs. 24h) shows diminishing returns in total units after the initial burst, applies a portfolio optimization model to select the mix of products for the next flash sale that maximizes total gross profit under a constraint of maximum total discount budget, simulates the expected revenue, units, and GP for the recommended portfolio, computes the optimal send time for the promotion notification based on historical click-to-purchase latency, and delivers a flash sale execution brief with featured products, recommended discount levels, duration, send-time, and projected GP.
Estimate demand uplift function (log-linear) for discount depth controlling for category, day, channel. Calculate GP-maximizing discount per candidate product given margin. Test diminishing returns for 6h/12h/24h durations. Optimize product portfolio mix to maximize total GP under discount budget constraint. Simulate projected revenue, units, GP. Find optimal send time from click-to-purchase latency. Deliver flash sale brief with product mix, discounts, duration, send-time, and projected GP.
Flash Sale · Discount Optimization · E-commerce
134
Employee Scheduling & Labor Cost Analysis
A small supermarket with 40 part-time and full-time staff is overspending on labor, with overtime costs rising 30% year-over-year, while also experiencing understaffing complaints during peak hours. The manager wants a data-driven schedule that balances service levels and labor cost.
Quyi models labor demand as a function of forecasted transaction volume per hour, calculates the required headcount per hour at a target throughput rate, applies an integer linear programming (ILP) model to assign shifts subject to constraints (labor law minimums, employee availability, skill requirements, minimum rest between shifts, maximum weekly hours), minimizes total labor cost while maintaining required headcount coverage, identifies the current schedule's inefficiencies (overtime drivers, idle gaps, unnecessary overlaps), calculates the cost of the optimized schedule vs. current, decomposes the cost gap into overtime elimination, idle-time reduction, and skill-mix optimization, plots a Gantt chart of the optimized weekly schedule, and delivers an implementation guide for the shift change.
Model hourly labor demand from transaction volume forecasts. Apply ILP to assign shifts minimizing total cost subject to coverage requirements, availability, labor law minimums, rest periods, and weekly hour caps. Identify current schedule inefficiencies. Decompose cost gap into overtime elimination, idle reduction, skill-mix. Plot Gantt chart of optimized weekly schedule. Deliver implementation guide with expected annual labor cost saving.
Labor · Scheduling · ILP
135
Competitive Price Monitoring & Response Strategy
A small supermarket owner wants to systematically track competitor prices on key value items (KVIs), understand where they are most price-disadvantaged, and decide where to match prices vs. where their own brand equity makes matching unnecessary.
Quyi ingests scraped competitor price data (or manually provided price surveys), calculates the price index (own price / competitor price) for each KVI per competitor, identifies items where the store is systematically higher (>5% price disadvantage on high-frequency items), segments KVIs into price-sensitive (traffic drivers) and convenience categories using price elasticity estimates, applies a selective price-matching strategy that targets only high-sensitivity items while preserving margin on low-sensitivity convenience items, simulates the total margin cost of different matching thresholds (match all KVIs, match only top-50, match only fresh), calculates the estimated customer perception improvement from price parity on a subset of KVIs using a price perception model, and delivers a competitive pricing playbook with match/hold/lead decisions per KVI and expected margin vs. perception trade-off.
Calculate price index (own/competitor) for all KVIs. Identify items with >5% price disadvantage on high-frequency SKUs. Segment KVIs by price sensitivity using elasticity. Simulate margin cost of matching all KVIs vs. top-50 vs. fresh only. Model customer price perception improvement from selective matching. Deliver competitive pricing playbook with match/hold/lead decisions per KVI and margin vs. perception trade-off.
Competitive Pricing · KVI · Strategy
136
Perishable Waste Reduction via Markdown Optimization
A supermarket loses significant margin on bakery, deli, and fresh produce through end-of-day waste. The category manager wants a markdown model that optimally times and sizes price reductions throughout the day to sell through perishables while maximizing revenue recovery.
Quyi fits a demand-markdown response curve for each perishable category using historical intraday sales data, models the probability of sell-through as a function of remaining time-to-expiry, current inventory level, and markdown depth, formulates a dynamic programming model that chooses the optimal markdown timing and percentage at each decision point (e.g., 4pm, 6pm, 8pm) to maximize expected revenue recovery subject to a waste rate target, compares the dynamic policy to the current fixed-time 30% markdown rule, estimates revenue recovery improvement and waste rate reduction, calculates payback on any technology investment (e.g., electronic shelf labels) required to implement dynamic markdowns at scale, plots sell-through curves and the optimal markdown path per category by time-of-day, and delivers an implementation playbook with decision rules that can be used manually by department staff.
Fit demand-markdown response curves for bakery, deli, and produce using intraday sales data. Model sell-through probability as function of time-to-expiry, inventory, and markdown depth. Formulate dynamic programming model for optimal markdown timing and depth at decision points (4pm, 6pm, 8pm) maximizing expected revenue recovery. Compare to fixed 30% markdown rule. Plot sell-through curves and optimal markdown paths. Deliver manual decision-rule playbook for department staff.
Perishables · Markdown · Waste Reduction
137
New Product Launch Prediction & Cannibalization Model
A small supermarket is deciding whether to list a new organic pasta brand offered by a supplier. The buyer wants to forecast the new product's weekly velocity, predict which existing SKUs it will cannibalize, and assess whether the net category contribution is positive.
Quyi runs an analogous product analysis fitting a Bass diffusion model to similar organic/premium pasta launches in the category, estimates the innovation coefficient (p) and imitation coefficient (q) from comparable SKU histories, generates a 52-week velocity forecast with confidence bands, models cannibalization using cross-price elasticity estimates between the new SKU and existing SKUs in the same category and price tier, calculates net category revenue change (new SKU revenue − cannibalized revenue from existing SKUs), estimates the margin impact accounting for the new SKU's different trade terms, performs a break-even analysis on the minimum velocity required for positive net contribution, and delivers a listing recommendation with conditions (minimum promotional support required, suggested facings, and review milestone at 12 weeks).
Fit Bass diffusion model to analogous SKU launches to estimate 52-week velocity with CI. Model cannibalization via cross-price elasticity with existing category SKUs. Calculate net category revenue (new SKU − cannibalized revenue). Estimate net margin impact with new trade terms. Break-even on minimum velocity for positive contribution. Deliver listing recommendation with minimum promotional support, facings, and 12-week review milestone.
New Product · Bass Model · Cannibalization
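The Bass diffusion forecast in this use case can be sketched in discrete weekly steps. The coefficients `p` and `q` and the market size below are hypothetical placeholders; in the workflow described above they would be estimated from comparable SKU histories. Note that the Bass model describes first-time adopters, so treating weekly adoptions as a velocity proxy is an assumption of this sketch.

```python
def bass_forecast(p, q, market_size, weeks=52):
    """Discrete-time Bass model.
    p: innovation coefficient, q: imitation coefficient,
    market_size: total potential adopters (households).
    Returns a list of new adopters per week."""
    cumulative = 0.0
    weekly = []
    for _ in range(weeks):
        adoption_rate = p + q * cumulative / market_size
        new_adopters = adoption_rate * (market_size - cumulative)
        weekly.append(new_adopters)
        cumulative += new_adopters
    return weekly


# Hypothetical weekly coefficients and a catchment of 1,000 potential buyers.
forecast = bass_forecast(p=0.03, q=0.38, market_size=1000)
peak_week = forecast.index(max(forecast)) + 1
print(round(forecast[0], 1), peak_week, round(sum(forecast), 1))
```

The peak week and the cumulative total at week 12 are natural inputs for the 12-week review milestone mentioned in the listing recommendation.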
138
Delivery Route & Last-Mile Cost Optimization
A small grocery business has started offering home delivery and is losing money on it due to inefficient routes and high per-order delivery cost. The owner wants to optimize delivery routes, identify minimum order values for profitability, and understand which zones are structurally unprofitable.
Quyi solves the Capacitated Vehicle Routing Problem (CVRP) for the existing delivery order set using the OR-Tools solver with time window constraints, compares the optimized routes to current driver paths and calculates distance saved and time saved per route, calculates the fully loaded cost per delivery (fuel, driver time, vehicle depreciation) under optimized routing, estimates break-even minimum order value per delivery zone given route density, identifies low-density zones where the cost-per-delivery structurally exceeds any reasonable minimum order value, models the impact of delivery-day batching (consolidating Wednesday/Thursday into one day) on route efficiency, plots an interactive map of optimized routes color-coded by profitability zone, and delivers a delivery strategy report with minimum order value recommendations by zone, batching schedule, and zones to consider converting to click-and-collect only.
Solve CVRP with OR-Tools for existing delivery orders with time window constraints. Calculate distance and time saved vs. current routes. Compute fully loaded cost per delivery under optimized routing. Estimate break-even minimum order by zone. Identify structurally unprofitable low-density zones. Model day-batching impact on route efficiency. Plot route map colored by profitability zone. Deliver strategy report with minimum order recommendations, batching schedule, and click-and-collect candidates.
Routing · Last-Mile · CVRP
139
A/B Testing Loyalty App Feature Impact
A supermarket chain launched a "personalized offer" feature in its loyalty app for a random 50% of members and wants to rigorously measure whether the feature caused an increase in visit frequency, basket size, and total spend over a 6-week test period.
Quyi validates the A/B randomization by checking pre-test balance on key covariates, calculates the average treatment effect (ATE) on visit frequency, basket size, and total spend using t-tests and Mann-Whitney U for non-normal distributions, applies CUPED (Controlled-experiment Using Pre-Experiment Data) to reduce variance and increase sensitivity, checks for novelty effects by examining treatment effect decay over the 6 weeks, compares the minimum detectable effect with the observed effect to assess the test's statistical power, tests for heterogeneous treatment effects (HTE) by segment (CLV tier, age, visit frequency quartile), estimates the annualized revenue impact of a full rollout, plots cumulative treatment effect over time and HTE bar charts, and delivers a feature release recommendation with the business case quantified in annual incremental revenue.
Validate A/B randomization with pre-test covariate balance check. Calculate ATE on visit frequency, basket size, spend using t-test and Mann-Whitney U. Apply CUPED variance reduction. Test for novelty effect over 6 weeks. Check statistical power vs. observed effect. Run HTE analysis by CLV tier, age, visit frequency quartile. Estimate annualized revenue of full rollout. Deliver feature release recommendation with business case in incremental annual revenue.
A/B Testing · CUPED · HTE
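The CUPED adjustment in this use case can be sketched as below. It uses each member's pre-test value of the same metric as the covariate and estimates the coefficient θ on the pooled sample. The function name `cuped_ate` and the tiny data set are hypothetical; a real test would have thousands of members per arm.

```python
from statistics import mean, variance


def cuped_ate(control, treatment):
    """control / treatment: lists of (pre_period_value, in_test_value) per member.
    Returns (ATE on adjusted values, adjusted control, adjusted treatment)."""
    pooled = control + treatment
    mean_x = mean(x for x, _ in pooled)
    mean_y = mean(y for _, y in pooled)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in pooled)
    var_x = sum((x - mean_x) ** 2 for x, _ in pooled)
    theta = cov_xy / var_x  # regression slope of in-test metric on pre-test metric

    def adjust(group):
        # Remove the part of each outcome explained by pre-test behaviour.
        return [y - theta * (x - mean_x) for x, y in group]

    control_adj, treatment_adj = adjust(control), adjust(treatment)
    return mean(treatment_adj) - mean(control_adj), control_adj, treatment_adj


# Hypothetical weekly spend per member: (before test, during test).
control = [(10, 11), (20, 19), (30, 31), (40, 41)]
treatment = [(12, 15), (22, 24), (28, 31), (38, 42)]

ate, control_adj, treatment_adj = cuped_ate(control, treatment)
raw_var = variance([y for _, y in control])
adj_var = variance(control_adj)
print(round(ate, 2), round(raw_var, 1), round(adj_var, 2))
```

The adjusted values keep the same expected treatment effect but have much lower variance, which is what lets the test detect smaller effects with the same sample size.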
140
Cash Flow Forecasting & Working Capital Management
A small supermarket owner is often surprised by cash shortfalls despite the business being profitable on paper. They want a 13-week cash flow forecast that integrates sales seasonality, supplier payment terms, rent, and payroll cycles so they can proactively manage working capital.
Quyi builds a 13-week rolling cash flow model integrating weekly sales forecasts (from seasonal decomposition), timing of cash receipts (proportion same-day cash vs. 7-day card settlement), supplier payment schedules (COD vs. 30-day vs. 60-day terms), weekly payroll cycles, rent and utility payment dates, and tax payment obligations, calculates the cash conversion cycle (DIO + DSO − DPO) and identifies which working capital lever has the highest improvement potential, runs a Monte Carlo simulation (1,000 scenarios) varying sales by ±15% around forecast to generate a cash balance distribution at each week, identifies weeks with >5% probability of negative cash balance, calculates the minimum cash buffer needed to keep the balance non-negative in 95% of simulated scenarios, recommends specific working capital actions (extend supplier terms on low-velocity SKUs, accelerate card settlement, reduce slow-moving inventory), and delivers a dashboard-style cash flow report with traffic light alerts on risk weeks.
Build 13-week rolling cash flow integrating seasonal sales forecast, card settlement timing, supplier payment schedules (COD/30d/60d), payroll, rent, utilities, tax obligations. Calculate cash conversion cycle and identify highest-impact working capital lever. Run Monte Carlo simulation (1,000 scenarios, sales ±15%) for weekly cash balance distribution. Flag weeks with >5% negative cash probability. Recommend working capital actions. Deliver cash flow dashboard with traffic-light alerts.
Cash Flow · Working Capital · Monte Carlo
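The Monte Carlo step of this use case can be sketched as follows. This is a simplified illustration: it assumes an independent uniform ±15% sales shock each week and treats all receipts as arriving in the week of sale (no card settlement lag). The function name, seed, and figures are hypothetical.

```python
import random


def simulate_cash(opening_cash, weekly_sales_forecast, weekly_outflows,
                  n_scenarios=1000, sales_swing=0.15, seed=42):
    """Returns, for each week, the share of scenarios with a negative cash balance."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    weeks = len(weekly_sales_forecast)
    negative_counts = [0] * weeks
    for _ in range(n_scenarios):
        cash = opening_cash
        for week in range(weeks):
            shock = rng.uniform(1 - sales_swing, 1 + sales_swing)
            cash += weekly_sales_forecast[week] * shock - weekly_outflows[week]
            if cash < 0:
                negative_counts[week] += 1
    return [count / n_scenarios for count in negative_counts]


# Hypothetical 13-week plan with a large rent payment in week 4.
forecast = [100_000] * 13
outflows = [100_000] * 13
outflows[3] = 130_000

risk = simulate_cash(20_000, forecast, outflows)
risk_weeks = [week + 1 for week, p in enumerate(risk) if p > 0.05]
print(risk_weeks)
```

Weeks in `risk_weeks` are the ones that would receive a red or amber traffic light; rerunning with a higher `opening_cash` until the list is empty gives a rough estimate of the buffer needed.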
141
Franchise & Multi-Location Performance Benchmarking
A small supermarket franchisor has 22 franchise locations and suspects there are large performance gaps between top and bottom performers that, if closed, could significantly lift total network revenue. They want a systematic benchmarking model to identify best practices from top stores.
Quyi applies Data Envelopment Analysis (DEA) to calculate efficiency scores for each location using inputs (square footage, headcount, rent cost, marketing spend) and outputs (revenue, gross profit, customer satisfaction score), identifies the frontier locations (efficiency = 1.0) and calculates the input slack (wasted resources) and output shortfall for each inefficient store, constructs a peer benchmark group for each underperformer (the efficient stores most similar in size and market type), extracts the operational practices that differentiate top-quartile from bottom-quartile stores using a random forest feature importance analysis on operational metrics (waste rate, OTIF, staff turnover, average transaction time, promo compliance), quantifies the revenue uplift if bottom-quartile stores reach median efficiency, and delivers a franchise improvement plan per underperformer with specific peer benchmarks and operational targets.
Apply DEA to 22 franchise locations (inputs: sq ft, headcount, rent, marketing; outputs: revenue, GP, CSAT). Identify efficient frontier locations and calculate input slack and output shortfall per store. Construct peer benchmarks per underperformer. Random forest feature importance on operational metrics to identify top-quartile differentiators. Quantify revenue uplift if bottom-quartile reaches median efficiency. Deliver franchise improvement plan per underperformer with operational targets.
DEA · Franchising · Benchmarking
142
End-to-End Retail P&L Intelligence Dashboard
A small supermarket owner currently manages the business using spreadsheets and monthly accountant reports, making it impossible to catch problems in real time. They want a fully automated Quyi-generated P&L intelligence report that consolidates all key financial and operational KPIs into a single weekly decision-support document.
Quyi ingests the weekly sales extract, cost-of-goods data, payroll summary, and expense log, calculates gross margin by category, operating expenses as a % of revenue, EBITDA, and net margin for the week vs. prior week vs. same week last year, computes 6 operational KPIs (shrinkage rate, waste %, OTIF, average basket size, transactions per labor hour, customer count), applies statistical process control (SPC) charts with control limits to flag any KPI that has gone out of control (beyond ±2σ from the rolling mean), highlights the top 3 categories with margin deterioration and the top 3 with margin improvement, summarizes the week's performance in a plain-English executive paragraph (generated by the Analyst Agent), and produces a complete PDF-ready weekly management report including financial summary table, 6 SPC charts, category margin waterfall, and narrative — formatted for review by a non-technical owner.
Ingest weekly sales, COGS, payroll, and expenses. Calculate gross margin by category, OpEx %, EBITDA, and net margin vs. prior week and same week LY. Compute 6 operational KPIs (shrinkage, waste %, OTIF, avg basket, transactions/labor-hour, customer count). Apply SPC charts with ±2σ control limits; flag out-of-control KPIs. Highlight top-3 margin improvers and detractors. Generate plain-English executive summary. Produce PDF-ready weekly management report with financial table, SPC charts, margin waterfall, and narrative.
P&L · SPC · Management Report
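The SPC flagging rule in this use case (a KPI beyond ±2σ from the rolling mean) can be sketched as below. The 8-week window, the use of the sample standard deviation, and the waste-rate series are assumptions for illustration.

```python
from statistics import mean, stdev


def spc_flags(series, window=8, sigmas=2.0):
    """For each point after the first `window` values, compare it with
    mean ± sigmas·σ of the preceding `window` values.
    Returns a list of (index, value, out_of_control)."""
    flags = []
    for i in range(window, len(series)):
        baseline = series[i - window:i]
        centre, sd = mean(baseline), stdev(baseline)
        lower, upper = centre - sigmas * sd, centre + sigmas * sd
        flags.append((i, series[i], not (lower <= series[i] <= upper)))
    return flags


# Hypothetical weekly waste % for one store; the last week spikes.
waste_pct = [2.0, 2.1, 1.9, 2.0, 2.2, 1.8, 2.0, 2.1, 2.0, 3.5]
flags = spc_flags(waste_pct)
out_of_control_weeks = [i for i, _, out in flags if out]
print(out_of_control_weeks)
```

Excluding the current point from its own baseline keeps a sudden spike from widening the control limits that are meant to catch it.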
🎓 Education & Institutional Operations Featured Example · 143
★ FEATURED
#143
Academic Timetable & Resource Scheduling System — From Small School to Large University

A complete constraint-satisfaction and optimization pipeline that generates conflict-free academic schedules for any institution — from a 15-classroom primary school to a 400-room research university with 1,200 course sections, 300 instructors, and 18,000 students — while respecting hard constraints (room capacity, instructor availability, accreditation requirements) and soft preferences (back-to-back teaching blocks, student travel time between buildings, instructor preferred times), and automatically propagating the national holiday calendar and exam blackout periods.

PHASE 1 — DATA INGESTION & CONSTRAINT ENCODING
Institution Profile & Input Parsing

Quyi ingests the institution's master data across six tables:

  • Rooms: room ID, building, floor, capacity, type (lecture hall / seminar / lab / studio / gym / online), AV equipment, accessibility features, department ownership, hourly setup cost
  • Instructors: ID, name, department, rank, contracted hours/week, specializations, hard unavailabilities (sabbatical, medical, external commitment), and soft preferences (preferred start time, max consecutive hours, preferred days off, building preference)
  • Courses: code, title, credits, weekly contact hours, session split preference (e.g. 2×90min vs. 3×60min), enrollment cap, room type required, co-requisites, prerequisite chain, department ownership, accreditation hours requirement
  • Students / Groups: section rosters or year-group enrollment counts per program for universities using cohort scheduling
  • Calendar: semester start/end, national holidays, institutional holidays, reading weeks, exam period blackout, sport event blackouts
  • Buildings & Travel Matrix: walking minutes between every building pair, used to penalize schedules that require students or instructors to travel more than 10 minutes between consecutive sessions

PHASE 2 — CONSTRAINT CLASSIFICATION
Hard vs. Soft Constraint Matrix

Hard constraints (violations = infeasible):

  • No instructor teaches two sessions simultaneously
  • No room hosts two sessions simultaneously
  • Room capacity ≥ course enrollment (with 5% buffer)
  • Room type matches course requirement (labs → labs only)
  • Session does not fall on a blackout day (holiday/exam)
  • Instructor unavailability windows respected absolutely
  • Co-requisite courses cannot overlap in time for same cohort
  • Minimum accreditation contact hours met per course
  • Sessions fit within institutional operating hours (07:00–22:00)

Soft constraints (violations = penalty points):

  • Instructor max 4 consecutive teaching hours without break (penalty × overrun)
  • Instructor preferred day-off respected (penalty 20 pts)
  • No instructor starts before 08:00 unless consented (penalty 30 pts)
  • Student travel time < 10 min between consecutive sessions (penalty × min overage)
  • Spread course sessions across week (Mon+Wed preferred over Mon+Tue for 2×/week)
  • Room utilization between 70–90% (under-utilization and over-booking both penalized)
  • Large lectures scheduled in the morning peak (avoid afternoon energy dip)
  • Lab sessions not scheduled on Friday afternoon (low attendance empirical penalty)
  • Part-time instructor sessions clustered into 2–3 days (travel cost minimization)
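A few of the hard and soft constraints listed above can be sketched as plain checks over a candidate schedule. This is only an illustration of the hard/soft distinction: the session dictionary layout and function names are hypothetical, and the 5% buffer is interpreted here as capacity ≥ 1.05 × enrollment, which is one possible reading.

```python
from collections import Counter


def hard_violations(sessions):
    """sessions: list of dicts with keys course, slot (day, hour), room,
    instructor, enrollment, room_capacity. Returns violation messages."""
    problems = []
    room_use = Counter((s["slot"], s["room"]) for s in sessions)
    instructor_use = Counter((s["slot"], s["instructor"]) for s in sessions)
    for (slot, room), n in room_use.items():
        if n > 1:
            problems.append(f"room {room} double-booked at {slot}")
    for (slot, instructor), n in instructor_use.items():
        if n > 1:
            problems.append(f"instructor {instructor} double-booked at {slot}")
    for s in sessions:
        if s["room_capacity"] < 1.05 * s["enrollment"]:  # assumed buffer rule
            problems.append(f"{s['course']} exceeds capacity of room {s['room']}")
    return problems


def soft_penalty(sessions, preferred_day_off, early_consent=frozenset()):
    """Sums two of the listed penalties: day-off (20 pts) and pre-08:00 start (30 pts)."""
    penalty = 0
    for s in sessions:
        day, hour = s["slot"]
        if preferred_day_off.get(s["instructor"]) == day:
            penalty += 20
        if hour < 8 and s["instructor"] not in early_consent:
            penalty += 30
    return penalty


# Hypothetical candidate schedule with deliberate problems.
schedule = [
    {"course": "MATH101", "slot": ("Mon", 9), "room": "R1", "instructor": "Ana",
     "enrollment": 40, "room_capacity": 50},
    {"course": "PHYS201", "slot": ("Mon", 9), "room": "R1", "instructor": "Ben",
     "enrollment": 30, "room_capacity": 50},
    {"course": "CHEM110", "slot": ("Tue", 7), "room": "R2", "instructor": "Ana",
     "enrollment": 20, "room_capacity": 20},
]
violations = hard_violations(schedule)
penalty = soft_penalty(schedule, preferred_day_off={"Ben": "Mon"})
print(violations, penalty)
```

A schedule is acceptable only when `hard_violations` returns an empty list; the soft penalty is then the quantity to minimize.
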
PHASE 3 — SOLVER ENGINE
Two-Stage Optimization Architecture

Stage A — Constraint Programming (CP-SAT via OR-Tools): Models the schedule as a CSP where each session is assigned a (timeslot, room, instructor) triple. CP-SAT proves feasibility and finds an initial conflict-free assignment using branch-and-bound with constraint propagation. For large instances (>500 sessions), the problem is decomposed by department and solved in blocks with shared-resource constraints enforced at the boundary. The solver runs with a 120-second time limit per department block and reports any unsatisfiable constraints with the minimal infeasible subset (MIS) for human review.

Stage B — Simulated Annealing Fine-Tuning: Starting from the CP-SAT feasible solution, a Simulated Annealing metaheuristic minimizes the total soft-constraint penalty score by making local moves (swap two sessions' timeslots, reassign a room, shift a session ±1 slot). Moves that increase the penalty are accepted with probability exp(−Δpenalty/T), where T cools geometrically over 50,000 iterations. Every candidate move is evaluated on both hard-feasibility and soft penalty, so hard constraints are never re-violated during fine-tuning. The final schedule is hard-feasible and has the lowest soft penalty found during the search.
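The annealing loop of Stage B can be sketched on a toy problem. Here six sessions share one room, so any permutation of six slots is hard-feasible and a swap move can never break a hard constraint. The penalty function, cooling rate, iteration count, and seed are hypothetical and much smaller than the 50,000 iterations described above.

```python
import math
import random


def anneal(start, penalty, neighbor, t_start=10.0, cooling=0.999,
           iterations=5000, seed=0):
    """Generic simulated annealing with geometric cooling.
    Returns the best assignment seen and its penalty."""
    rng = random.Random(seed)
    current, current_cost = start, penalty(start)
    best, best_cost = current, current_cost
    temperature = t_start
    for _ in range(iterations):
        candidate = neighbor(current, rng)
        delta = penalty(candidate) - current_cost
        # Always accept improvements; accept worse moves with prob exp(-delta / T).
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_cost = candidate, current_cost + delta
            if current_cost < best_cost:
                best, best_cost = current, current_cost
        temperature *= cooling
    return best, best_cost


def swap_two(assignment, rng):
    """Swap the slots of two sessions; keeps all slots distinct (hard-feasible)."""
    i, j = rng.sample(range(len(assignment)), 2)
    swapped = list(assignment)
    swapped[i], swapped[j] = swapped[j], swapped[i]
    return tuple(swapped)


PREFERRED = (0, 1, 2, 3, 4, 5)  # hypothetical preferred slot per session


def preference_penalty(assignment):
    """Soft penalty: distance of each session from its preferred slot."""
    return sum(abs(slot - want) for slot, want in zip(assignment, PREFERRED))


start = (5, 4, 3, 2, 1, 0)  # feasible but far from preferences
best, best_cost = anneal(start, preference_penalty, swap_two)
print(best, best_cost)
```

Restricting the move set to feasibility-preserving moves is one simple way to keep the search inside the hard-feasible region; moves such as room reassignment would need an explicit feasibility check before acceptance.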

PHASE 4 — CONFLICT DETECTION & REPORTING
Conflict Matrix & Audit Trail

After the solver completes, Quyi runs a full constraint audit that produces:

  • Conflict Matrix: a cross-tabulated table of all (course × timeslot) pairs showing any residual soft violations, color-coded by severity
  • Room Utilization Report: capacity fill rate per room, identifying both chronically under-used rooms (candidates for consolidation) and over-booked rooms where enrollment growth will cause future capacity problems
  • Instructor Load Report: total scheduled hours per instructor vs. contracted hours, overtime flags, days-worked count, consecutive-hour violations, and a fairness Gini coefficient across the faculty measuring equitability of load distribution
  • Student Conflict Report: for each student group or individual student (university mode), any schedule where two enrolled courses overlap, with the offending pair and a suggested resolution (alternate section, async option)
  • Accreditation Compliance Report: confirmation that minimum contact hours per course meet regulatory requirements for each program, with a sign-off table formatted for submission to accreditation bodies
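
For readers unfamiliar with the fairness metric in the Instructor Load Report, the Gini coefficient of a set of teaching loads can be computed with the standard sorted-values formula. The function name and input layout are assumptions for illustration. A value of 0 means every instructor carries the same load, and values near 1 mean the load is concentrated on a few people.

```python
def gini(loads):
    """Gini coefficient of non-negative loads (for example, weekly teaching hours)."""
    values = sorted(loads)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0
    # Standard formula on ascending values with ranks 1..n:
    # G = 2 * sum(rank * value) / (n * total) - (n + 1) / n
    weighted = sum(rank * value for rank, value in enumerate(values, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n
```

With four instructors, four equal loads give 0, while a single instructor carrying all the hours gives 0.75, the maximum for n = 4.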

PHASE 5 — CALENDAR INTEGRATION & HOLIDAY PROPAGATION
Dynamic Academic Calendar Engine

Quyi loads the national holiday calendar (configurable per country — US, Mexico, Spain, France, Brazil, UK, Colombia, Argentina supported out of the box), institutional holidays (founder's day, graduation week, sports days), reading weeks, mid-term exam windows, and final exam blackout periods into a unified DateGrid that marks every calendar slot as: Available, Soft-Avoid (low-priority teaching), Holiday (hard block), Exam (hard block), or Makeup (compensatory session slot).

For sessions that fall on a holiday that was not anticipated (late government declaration), the engine automatically: (1) flags affected sessions, (2) finds the nearest Makeup slot in the same week or the following week, (3) checks that the instructor, room, and student group are all available on the makeup slot, (4) proposes the reassignment for human approval via the conflict report.
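
Steps (2) and (3) above can be sketched as a search over candidate makeup slots. This is a simplified illustration, not the engine's code: the function name and data layout are hypothetical, a slot is a `(date, hour)` pair, weeks are assumed to start on Monday, and only slots after the cancelled date are considered.

```python
from datetime import date, timedelta


def find_makeup_slot(cancelled, makeup_slots, busy_by_resource):
    """Return the earliest free makeup slot in the same or the following week.

    cancelled:        date of the session lost to the holiday
    makeup_slots:     iterable of (date, hour) slots marked Makeup in the calendar
    busy_by_resource: {"instructor": {...}, "room": {...}, "group": {...}},
                      each value a set of (date, hour) slots already taken
    """
    week_start = cancelled - timedelta(days=cancelled.weekday())  # Monday
    window_end = week_start + timedelta(days=13)  # Sunday of the following week
    candidates = sorted(s for s in makeup_slots if cancelled < s[0] <= window_end)
    for slot in candidates:
        # The instructor, room, and student group must all be free.
        if all(slot not in busy for busy in busy_by_resource.values()):
            return slot
    return None  # nothing suitable: escalate to the conflict report
```

The returned slot is only a proposal. In the workflow described above it still goes to a human for approval.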

Generates a full-semester session count per course, confirming that each course still meets its minimum contact hours after all holidays and makeup sessions. This is a critical requirement for accreditation and for institutions operating on a compressed semester calendar.

PHASE 6 — SCENARIO ANALYSIS & WHAT-IF SIMULATION
Decision Support for Administrators

Quyi supports interactive scenario analysis so administrators can answer operational questions without re-running the full solver:

  • +20% Enrollment: which rooms become capacity bottlenecks? Which courses need a second section?
  • Instructor Absence: if Prof. García is on sick leave for 3 weeks, which courses need reassignment, and to whom (by specialization match)?
  • Room Closure: if Building C is closed for renovation for 6 weeks, what is the cascading schedule impact, and which courses require relocation?
  • New Course Addition: given a new elective with 45 students that requires a lab on MWF, what is the earliest available conflict-free slot?
  • Hybrid Mode Transition: if 30% of lectures move online, which physical rooms can be released, and what is the space cost saving?

Each scenario is solved incrementally (fixing unchanged assignments, re-solving only affected variables) in under 10 seconds for mid-size instances, giving administrators a real-time decision-support tool rather than a batch overnight process.
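
The first question in the enrollment-growth scenario (which rooms become capacity bottlenecks) reduces to a projection check that needs no solver at all. The sketch below is illustrative only: the function name and data layout are assumptions, and growth is applied uniformly to every section.

```python
def capacity_bottlenecks(sections, room_capacity, growth_percent=20):
    """Return sections whose projected enrollment exceeds their room's capacity.

    sections:      {section: (room, current_enrollment)}
    room_capacity: {room: seats}
    """
    over = {}
    for name, (room, enrolled) in sections.items():
        # Integer ceiling of enrolled * (1 + growth), avoiding float rounding.
        projected = -(-enrolled * (100 + growth_percent) // 100)
        if projected > room_capacity[room]:
            over[name] = (room, projected, room_capacity[room])
    return over
```

Sections returned here are the candidates for a larger room or a second section. Deciding which of the two applies is where the incremental re-solve comes in.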

PHASE 7 — OUTPUT FORMATS & EXPORTS
Publication-Ready Schedules for Every Stakeholder

For students: personalized weekly timetable PDF per student / section, showing course name, room number with building map link, instructor name, and session type (lecture / lab / tutorial / seminar). Color-coded by subject area. Includes travel-time warnings for back-to-back sessions in different buildings.

For instructors: personal teaching schedule with room details, total weekly hours, office hours slot suggestions (fills instructor's non-teaching availability), and a semester-long session calendar in iCal format (.ics) for Google Calendar / Outlook import.

For room managers: room-by-room booking grid for each week of the semester, showing occupancy %, peak hours, and cleaning/setup windows. Exportable as Excel for facilities management systems.

For registrars / accreditation: full course contact-hour compliance table, instructor load summary, and a structured JSON/CSV master timetable compatible with student information and learning management systems (Banner, Ellucian, Blackboard, Moodle). Includes a LaTeX-formatted timetable grid for official academic catalogs.
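
As an illustration of the instructor calendar export, a minimal iCalendar (.ics) file can be assembled from plain text lines. This is a sketch, not Quyi's exporter: the function names, the session dictionary keys, and the `PRODID` value are hypothetical. Start and end times are written as floating local times, `stamp` is assumed to be in UTC, and the folding of long lines required by the iCalendar specification is omitted for brevity.

```python
from datetime import datetime


def _escape(text: str) -> str:
    """Escape backslash, semicolon, comma, and newline for iCalendar TEXT values."""
    return (text.replace("\\", "\\\\").replace(";", "\\;")
                .replace(",", "\\,").replace("\n", "\\n"))


def build_ics(sessions, stamp: datetime, prodid="-//Example//Timetable//EN") -> str:
    """Build a calendar with one VEVENT per session.

    sessions: list of dicts with keys uid, title, room, start, end
              (start and end are naive local datetimes)
    """
    fmt = "%Y%m%dT%H%M%S"
    lines = ["BEGIN:VCALENDAR", "VERSION:2.0", f"PRODID:{prodid}"]
    for s in sessions:
        lines += [
            "BEGIN:VEVENT",
            f"UID:{s['uid']}",
            f"DTSTAMP:{stamp.strftime(fmt)}Z",
            f"DTSTART:{s['start'].strftime(fmt)}",
            f"DTEND:{s['end'].strftime(fmt)}",
            f"SUMMARY:{_escape(s['title'])}",
            f"LOCATION:{_escape(s['room'])}",
            "END:VEVENT",
        ]
    lines.append("END:VCALENDAR")
    # iCalendar content lines are terminated with CRLF.
    return "\r\n".join(lines) + "\r\n"
```

Writing the returned string to a file with an `.ics` extension is normally enough for calendar applications to offer an import.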

PHASE 8 — SCALE & INSTITUTION PROFILES
Validated Across Institution Sizes

Institution Type          | Rooms   | Instructors | Sections  | Students      | Solve Time
Primary School (K–6)      | 12–20   | 15–30       | 30–60     | 300–600       | < 5 sec
Middle / High School      | 20–50   | 30–80       | 80–200    | 600–2,000     | < 30 sec
Community College         | 40–100  | 60–150      | 150–400   | 2,000–8,000   | < 2 min
Mid-size University       | 100–200 | 150–300     | 400–800   | 8,000–18,000  | < 8 min
Large Research University | 300–600 | 300–600     | 800–2,000 | 18,000–60,000 | < 25 min

SPECIAL MODULE A — EXAM SCHEDULING

Quyi generates the exam timetable as a separate optimization problem, ensuring:

  (1) No student sits two exams on the same day (hard constraint).
  (2) No student sits exams on three or more consecutive days (soft constraint; penalty scales with the number of consecutive exam days).
  (3) Exams for large courses (enrollment > 200) are scheduled in the first half of the exam period to allow adequate marking time before grade submission deadlines.
  (4) Room assignments for exams use a dispersed seating model (1 student per 2 seats for integrity) and compute the number of rooms and invigilators required per exam.
  (5) Clash-free exam distance is maximized: pairs of courses with the highest enrollment overlap are scheduled as far apart as possible in the exam period using a graph coloring heuristic (courses sharing students = connected nodes, exam slots = colors).
  (6) An Exam Clash Report is produced listing every student with unavoidable clashes (when the problem is over-constrained), ranked by severity (same-day, same-half-day), with resolutions offered (deferred sitting, special arrangement).
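
The graph coloring formulation in item (5) can be illustrated with a greedy heuristic: courses are nodes, two courses are connected when at least one student takes both, and each exam slot is a color. This sketch uses hypothetical names and only guarantees that connected courses land in different slots. It does not maximize the distance between them, and a `None` result means the greedy order failed, not that the problem is provably infeasible.

```python
from itertools import combinations


def schedule_exams(enrollments, n_slots):
    """Greedy graph coloring: return {course: slot_index} or None.

    enrollments: {student: set of course names}
    """
    # Build the conflict graph: an edge joins courses that share a student.
    conflicts = {}
    for courses in enrollments.values():
        for course in courses:
            conflicts.setdefault(course, set())
        for a, b in combinations(sorted(courses), 2):
            conflicts[a].add(b)
            conflicts[b].add(a)

    # Color the most-constrained courses first (highest degree, then by name).
    order = sorted(conflicts, key=lambda c: (-len(conflicts[c]), c))
    slot_of = {}
    for course in order:
        taken = {slot_of[n] for n in conflicts[course] if n in slot_of}
        free = [slot for slot in range(n_slots) if slot not in taken]
        if not free:
            return None
        slot_of[course] = free[0]
    return slot_of
```

A fuller version would weight edges by the number of shared students and prefer slots far from already-placed neighbors, which is where the distance objective would enter.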

SPECIAL MODULE B — SUBSTITUTION & ABSENCE MANAGEMENT

When an instructor reports an unplanned absence, Quyi's substitution engine identifies qualified substitutes (matching specialization tags), checks their availability in the affected timeslot, ranks candidates by current load balance (least-loaded first), checks room validity, and generates a substitution proposal in under 3 seconds.

For planned absences (conference travel, sabbatical), the system offers three resolution strategies:

  • Redistribution: sessions covered by a qualified colleague with load compensation
  • Makeup scheduling: sessions moved to the instructor's nearest available slot
  • Asynchronous replacement: session replaced by recorded content with a TA-facilitated discussion section

Tracks the cumulative contact-hour balance per course throughout the semester and alerts the registrar if a course falls below the accreditation minimum with 3 weeks remaining.
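
The candidate ranking for unplanned absences (specialization match, availability, least-loaded first) can be sketched as a filter followed by a sort. The `Instructor` class, its field names, and the function name are assumptions for illustration. Room validity and the planned-absence strategies are not covered here.

```python
from dataclasses import dataclass, field


@dataclass
class Instructor:
    name: str
    tags: set
    weekly_hours: int
    busy_slots: set = field(default_factory=set)


def rank_substitutes(required_tag, slot, instructors, absent_name):
    """Return qualified, available instructors ordered least-loaded first."""
    eligible = [
        person for person in instructors
        if person.name != absent_name
        and required_tag in person.tags       # specialization match
        and slot not in person.busy_slots     # free in the affected timeslot
    ]
    # Least-loaded first; the name breaks ties so the order is deterministic.
    return sorted(eligible, key=lambda person: (person.weekly_hours, person.name))
```

The first entry of the returned list is the proposed substitute, and the rest serve as fallbacks if the proposal is declined.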

SPECIAL MODULE C — MULTI-SEMESTER & CURRICULUM SEQUENCING

For universities with structured degree programs, Quyi models the full prerequisite dependency graph across all courses (DAG — directed acyclic graph) to validate that the multi-semester schedule allows every student to complete their degree in the standard time. Detects curriculum bottlenecks — courses that must be passed before a large number of downstream courses and are offered only once per year — and flags them for priority scheduling (best room, best timeslot, with a waitlist overflow section planned). Calculates the theoretical minimum time to degree under current offering patterns vs. optimal offering, identifying cases where a single course offered only in fall forces a semester delay for students who fail it. Generates a curriculum flow map visualization (prerequisite DAG with enrollment volumes) and a degree-audit simulation showing the distribution of expected graduation times across the enrolled student population under the proposed timetable.
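
Two of the quantities described above can be read directly off the prerequisite DAG: the earliest semester in which each course can be taken, and the number of downstream courses that depend on it. The sketch below uses Python's standard `graphlib` module. The function names are hypothetical, and it makes two simplifying assumptions: every course is offered every semester, and there is no per-semester credit limit.

```python
from graphlib import TopologicalSorter


def earliest_semesters(prereqs):
    """Map each course to the earliest semester it can be taken.

    prereqs: {course: set of prerequisite courses}
    """
    semester = {}
    # static_order() yields every course after all of its prerequisites.
    for course in TopologicalSorter(prereqs).static_order():
        needed = prereqs.get(course, ())
        semester[course] = 1 + max((semester[p] for p in needed), default=0)
    return semester


def downstream_counts(prereqs):
    """Count the courses that depend, directly or transitively, on each course."""
    order = list(TopologicalSorter(prereqs).static_order())
    downstream = {course: set() for course in order}
    # Walk from the most advanced courses back toward the entry-level ones.
    for course in reversed(order):
        for prerequisite in prereqs.get(course, ()):
            downstream[prerequisite] |= downstream[course] | {course}
    return {course: len(deps) for course, deps in downstream.items()}
```

The largest value from `earliest_semesters` is the length of the longest prerequisite chain, a lower bound on time to degree. Courses with high `downstream_counts` that are offered only once per year are the bottleneck candidates. `TopologicalSorter` raises `CycleError` if the prerequisite data contains a cycle.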

EXAMPLE PROMPT
"Generate a full-semester conflict-free timetable for a mid-size university with 180 rooms (lecture halls, seminar rooms, labs, studios), 220 instructors (full-time and part-time), 620 course sections across 8 faculties, and 14,000 students organized in cohort groups. Encode all hard constraints (room capacity with 5% buffer, room type matching, no double-booking, instructor unavailabilities, national holiday calendar for Mexico, exam blackout weeks) and soft constraints (max 4 consecutive teaching hours, instructor preferred days, <10-min travel between consecutive sessions for students, room utilization 70–90%, morning slots for large lectures). Use CP-SAT for feasibility then Simulated Annealing for soft penalty minimization. Generate: conflict matrix, room utilization heatmap, instructor load fairness report (Gini coefficient), accreditation contact-hour compliance table, student conflict report, exam timetable with graph coloring (clash-free distance maximized), substitution recommendations for 3 planned absences, and curriculum bottleneck analysis. Export schedules as personalized student PDFs, instructor iCal files, room booking Excel, and SIS-compatible JSON. Run scenario: what happens if enrollment in Engineering grows 25% next semester — which rooms become capacity bottlenecks and which courses need a second section?"
EXPECTED OUTPUTS

  • Conflict-Free Master Timetable
  • Room Utilization Heatmap
  • Instructor Load Fairness (Gini)
  • Accreditation Compliance Table
  • Student Conflict Report
  • Exam Timetable (Graph Coloring)
  • Holiday Propagation Calendar
  • Substitution Plan (3 absences)
  • Curriculum Bottleneck DAG
  • Degree-Audit Simulation
  • +25% Enrollment Scenario
  • Student PDF Timetables
  • Instructor iCal (.ics) Files
  • Room Booking Excel
  • SIS-Compatible JSON/CSV

Getting the most
out of Quyi.

Small habits that significantly improve the quality and speed of your analyses.

🎯
Be Specific with Goals
State the statistical method, output format, and acceptance criteria you expect. "Analyze sales" gives a generic result. "Forecast 12-week sales by region using ARIMA, report MAPE ≤ 15%, generate weekly charts with confidence intervals" gives a precise, reproducible one.
📎
Upload Context First
Upload your methodology papers, domain guides, or regulatory standards to the Memory Bank before running analysis. The agents will anchor their reasoning to your specific context rather than general knowledge — producing domain-appropriate methods and language.
🔖
Curate Citations Early
Use the Citations panel to search for and save relevant papers before executing. Saved citations become methodological anchors in the final report, grounding conclusions in the literature you've curated rather than generic references.
💾
Work Within Projects
Always work inside a named project. Memory compounds across sessions — the third analysis benefits from everything learned in the first two. Projects that build on themselves produce better results than repeated fresh sessions on the same topic.
🖼️
Include Your Figures
Upload existing plots and scientific images from prior work. Quyi reads charts and microscopy images as fluently as tables. Ask it to interpret, compare, or integrate figures directly into the analysis — the image and the data become one body of evidence.
💬
Follow Up in Chat
Results are the start, not the end. Use Chat to ask clarifying questions, explore unexpected findings, request breakdowns by subgroup, or ask Quyi to explain its methodology in plain language. Every question builds on the full prior context of the session.