powershell_example
This example demonstrates core programming principles that apply regardless of language — Excel, PowerShell, or R:
- One job per script: each script does exactly one thing
- Configuration over hardcoding: constants like exchange rates live in .env, not buried in code
- Immutable inputs: raw data is never modified, so the pipeline can always be rerun from scratch
- Fail fast: validation runs early and stops the pipeline with a clear message before bad data spreads
- Separation of concerns: scripts don't know or care what runs before or after them
- Orchestration: a single caller (main.sh) owns the sequence and can be scheduled via cron
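The configuration and fail-fast principles above can be sketched in shell. This is a minimal illustration under stated assumptions, not the repository's actual code: it assumes .env holds plain KEY=value lines that a shell can source, and the function names and the EXCHANGE_RATE setting are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only. Assumes .env contains shell-compatible KEY=value lines.
# EXCHANGE_RATE (used in the usage note) is a hypothetical setting name.

# load_env FILE
# Source FILE with auto-export on, so child processes such as Rscript
# inherit every setting as an environment variable.
load_env() {
  local env_file="$1"
  if [ ! -f "$env_file" ]; then
    echo "Config file not found: $env_file" >&2
    return 1
  fi
  set -a          # export every variable assigned from here on
  . "$env_file"
  set +a          # back to normal assignment behaviour
}

# require_var NAME
# Fail fast with a clear message when NAME is missing from the environment.
# printenv only sees exported variables, which is what load_env produces.
require_var() {
  local name="$1"
  if [ -z "$(printenv "$name")" ]; then
    echo "Missing required setting: $name" >&2
    return 1
  fi
}
```

A caller could run `load_env ./.env && require_var EXCHANGE_RATE || exit 1` before any step starts. An R step would then read the value with `Sys.getenv("EXCHANGE_RATE")` instead of hardcoding it.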
Project structure
powershell_example/
├── .env                        ← exchange rate and future config
├── main.sh                     ← pipeline caller, runs all steps in order
├── data/
│   ├── raw/                    ← original source, never modified
│   ├── interim/                ← transformed working files (steps 03–06)
│   ├── processed/              ← calculated output (step 07)
│   └── formatted/              ← presentation-ready, rounded (step 08)
└── scripts/
    ├── 00_paths.R              ← paths + config, sourced by all scripts
    ├── 01_create_data.R        ← creates wide CSVs → raw/
    ├── 02_validate.R           ← checks column counts, stops on failure
    ├── 03_convert_currency.R   ← EUR to USD, stays wide → interim/
    ├── 04_pivot_income.R       ← wide to long → interim/
    ├── 05_convert_units.R      ← thousands to persons, pivot pop to long → interim/
    ├── 06_merge.R              ← join income + population → interim/
    ├── 07_calc.R               ← income per person → processed/
    └── 08_format.R             ← round to 2 decimals → formatted/
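To make the role of the pipeline caller concrete, here is a rough sketch of how a script like main.sh might run the numbered steps in order and stop at the first failure. It is not the repository's actual main.sh: the function name `run_steps` is hypothetical, and it assumes each step is launched with Rscript and exits with a non-zero status when it fails.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a pipeline caller; not the repository's main.sh.

# run_steps RUNNER SCRIPT...
# Run each SCRIPT with RUNNER, in the order given.
# Stop at the first step that exits non-zero so later steps never see bad data.
run_steps() {
  local runner="$1"
  shift
  local step
  for step in "$@"; do
    echo "Running ${step##*/}"
    if ! "$runner" "$step"; then
      echo "Step ${step##*/} failed; stopping pipeline" >&2
      return 1
    fi
  done
  echo "Pipeline finished"
}
```

From the project root, `run_steps Rscript scripts/0[1-8]_*.R` would run steps 01 to 08, since the shell expands the glob in sorted order. 00_paths.R is left out because the other scripts source it themselves.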
A note on what to commit
This repo commits everything for illustration purposes. In a real project you would typically exclude:
- .env: may contain API keys, credentials, or proprietary constants
- data/: raw and processed data files are often too large for git and may contain proprietary or personally identifiable information
Both would normally be listed in .gitignore.
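As a small illustration, the two entries could be added to .gitignore with a helper like the one below. The function name `ignore_path` is hypothetical, and the sketch assumes an existing .gitignore ends with a newline.

```shell
#!/usr/bin/env bash
# Hypothetical helper: add an entry to a .gitignore file only if it is missing.

# ignore_path ENTRY [FILE]
# FILE defaults to .gitignore in the current directory.
ignore_path() {
  local entry="$1"
  local file="${2:-.gitignore}"
  touch "$file"                              # create the file if needed
  if ! grep -qxF -- "$entry" "$file"; then   # exact whole-line match
    printf '%s\n' "$entry" >> "$file"
  fi
}
```

Running `ignore_path .env` and `ignore_path data/` from the project root is safe to repeat, because existing entries are not duplicated. Files that git already tracks stay tracked until they are removed from the index, for example with `git rm --cached .env`.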
Usage
bash /data/projects/r/powershell_example/main.sh
Scheduling with cron
Cron is the Linux/Mac equivalent of Windows Task Scheduler — it runs a program automatically on a schedule with no human intervention.
To run automatically every Monday at 8am:
0 8 * * 1 /data/projects/r/powershell_example/main.sh >> /tmp/pipeline.log 2>&1
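The five leading fields of that entry are minute, hour, day of month, month, and day of week. The sketch below rebuilds the same line from its parts to show which value goes where; the function name `cron_entry` is hypothetical and exists only for illustration.

```shell
#!/usr/bin/env bash
# Illustration only: assemble a crontab line from its individual fields.

# cron_entry MINUTE HOUR DAY_OF_MONTH MONTH DAY_OF_WEEK COMMAND LOGFILE
# Prints a crontab line that appends stdout and stderr of COMMAND to LOGFILE.
cron_entry() {
  printf '%s %s %s %s %s %s >> %s 2>&1\n' "$1" "$2" "$3" "$4" "$5" "$6" "$7"
}

# minute 0, hour 8, any day of month, any month, day of week 1 (Monday)
cron_entry 0 8 '*' '*' 1 /data/projects/r/powershell_example/main.sh /tmp/pipeline.log
```

The printed line can be pasted into the editor opened by `crontab -e`, and `crontab -l` lists what is currently installed. Because the entry calls main.sh directly rather than through bash, the script presumably needs a shebang line and the executable bit (`chmod +x main.sh`).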
A note on corporate environments: IT departments are often protective of who can schedule automated jobs on shared servers — and for good reason. Silent background processes can consume resources, touch shared databases, or trigger emails without anyone knowing they exist. On your own machine, Task Scheduler is fair game. On a company server, the right move is to document what the job does, show IT, and ask them to schedule it officially. That conversation also creates a paper trail, which matters in regulated industries.