Here’s the Rakefile we’ve been working on for the last few episodes. It finds Markdown source files in a sources subdirectory of a project, and produces a parallel hierarchy of HTML files in an outputs subdirectory.
SOURCE_FILES = Rake::FileList.new("sources/**/*.md", "sources/**/*.markdown") do |fl|
  fl.exclude("**/~*")
  fl.exclude(/^scratch\//)
  fl.exclude do |f|
    `git ls-files #{f}`.empty?
  end
end

task :default => :html
task :html => SOURCE_FILES.pathmap("%{^sources/,outputs/}X.html")

rule ".html" => ->(f){source_for_html(f)} do |t|
  sh "pandoc -o #{t.name} #{t.source}"
end

def source_for_html(html_file)
  SOURCE_FILES.detect{|f| f.ext('') == html_file.ext('')}
end
Now that we are recreating the input file hierarchy in an outputs directory, we need to ensure that the destination directory exists before generating any HTML files. An easy way to do this in Rake is to use a directory task. This is like a file task, except for a directory. But unlike a file task, we don’t have to supply any code for how to make the directory appear if it doesn’t already exist. Simply by specifying the task, we are giving Rake implicit instructions to create the directory if it is needed.
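In the Rakefile itself, the declaration is the single line directory "outputs". As a minimal sketch of what that line does, the script below declares a directory task and invokes it by hand. The temporary location and the DEMO_ names are assumptions made only so the demo stays away from a real project.

```ruby
require "rake"
require "tmpdir"

extend Rake::DSL  # makes `directory` callable outside a Rakefile

# Hypothetical scratch location, used only for this demonstration.
DEMO_ROOT   = Dir.mktmpdir("rake-directory-demo")
DEMO_OUTPUT = File.join(DEMO_ROOT, "outputs")

# No block is needed: Rake already knows how to create a missing directory.
directory DEMO_OUTPUT

# Invoking the task creates the directory if it does not exist yet.
Rake::Task[DEMO_OUTPUT].invoke

puts Dir.exist?(DEMO_OUTPUT)  # true
```

Invoking the task a second time would do nothing, because a directory task only counts as needed while the directory is missing.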
We add this directory to the list of dependencies for the .html rule.
rule ".html" => [->(f){source_for_html(f)}, "outputs"] do |t|
  sh "pandoc -o #{t.name} #{t.source}"
end
Now when we run rake, we can see that it creates the directory before beginning to generate the HTML files. Unfortunately, it runs into a problem as it tries to build the appendix.html file. Since this file is in a subdirectory of the sources directory, we want the HTML output file to be in a corresponding subdirectory of the outputs directory. But this subdirectory doesn’t yet exist.
$ rake
mkdir -p outputs
pandoc -o outputs/backmatter/appendix.html sources/backmatter/appendix.md
pandoc: outputs/backmatter/appendix.html: openFile: does not exist (No such file or directory)
rake aborted!
Command failed with status (1): [pandoc -o outputs/backmatter/appendix.html...]
/home/avdi/Dropbox/rubytapas/133-rake-file-operations/project/Rakefile:16:in `block in <top (required)>'
Tasks: TOP => default => html => outputs/backmatter/appendix.html
(See full trace by running task with --trace)
To ensure this or any other intermediate directory exists before producing an HTML file, we could execute a mkdir -p shell command, using #pathmap to pass just the directory portion of the target filename.
sh "mkdir -p #{t.name.pathmap('%d')}"
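To see what the '%d' spec yields, here is a quick check that can be run outside the Rakefile. It assumes only that the rake gem is installed, since #pathmap is an extension Rake adds to String. The two target names are taken from this project, and the constant names are made up for the demo.

```ruby
require "rake"  # loads Rake's String#pathmap extension

# '%d' keeps only the directory portion of a path.
NESTED_DIR = "outputs/backmatter/appendix.html".pathmap("%d")
TOP_DIR    = "outputs/ch1.html".pathmap("%d")

puts NESTED_DIR  # outputs/backmatter
puts TOP_DIR     # outputs
```

For the appendix, then, the shell command above would amount to mkdir -p outputs/backmatter.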
But Rake gives us a shortcut for this. Instead of running a shell command, we can use a mkdir_p method right in the task:
SOURCE_FILES = Rake::FileList.new("sources/**/*.md", "sources/**/*.markdown") do |fl|
  fl.exclude("**/~*")
  fl.exclude(/^sources\/scratch\//)
  fl.exclude do |f|
    `git ls-files #{f}`.empty?
  end
end

task :default => :html
task :html => SOURCE_FILES.pathmap("%{^sources/,outputs/}X.html")

directory "outputs"

rule ".html" => [->(f){source_for_html(f)}, "outputs"] do |t|
  mkdir_p t.name.pathmap("%d")
  sh "pandoc -o #{t.name} #{t.source}"
end

def source_for_html(html_file)
  SOURCE_FILES.detect{|f|
    f.ext('') == html_file.pathmap("%{^outputs/,sources/}X")
  }
end
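One detail in this listing is easy to miss: source_for_html now translates the target path back into the sources tree before comparing, since an outputs/ path can never equal a sources/ path on its own. The sketch below isolates that lookup. DEMO_SOURCES and demo_source_for_html are hypothetical stand-ins, and a plain array replaces the FileList so the example does not depend on the filesystem or on git.

```ruby
require "rake"  # provides String#pathmap and String#ext

# Stand-in for SOURCE_FILES; the file names mirror this project's layout.
DEMO_SOURCES = [
  "sources/ch1.md",
  "sources/ch4.markdown",
  "sources/backmatter/appendix.md"
]

def demo_source_for_html(html_file)
  # "outputs/ch1.html" becomes "sources/ch1": swap the prefix, drop the extension.
  stem = html_file.pathmap("%{^outputs/,sources/}X")
  # Compare against each source with its own extension stripped.
  DEMO_SOURCES.detect { |f| f.ext("") == stem }
end

puts demo_source_for_html("outputs/backmatter/appendix.html")  # sources/backmatter/appendix.md
puts demo_source_for_html("outputs/ch4.html")                  # sources/ch4.markdown
```

Because the comparison ignores the source extension, the same lookup handles both .md and .markdown files.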
Now when we run rake, it ensures the target directory exists before each Markdown-to-HTML transformation.
$ rake
mkdir -p outputs/backmatter
pandoc -o outputs/backmatter/appendix.html sources/backmatter/appendix.md
mkdir -p outputs
pandoc -o outputs/ch1.html sources/ch1.md
mkdir -p outputs
pandoc -o outputs/ch2.html sources/ch2.md
mkdir -p outputs
pandoc -o outputs/ch3.html sources/ch3.md
mkdir -p outputs
pandoc -o outputs/ch4.html sources/ch4.markdown
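The repeated mkdir -p outputs lines in that output are harmless. Rake's mkdir_p wraps FileUtils.mkdir_p from Ruby's standard library, which creates any missing parent directories and does nothing when the directory is already there. A small sketch, using a hypothetical temporary location and made-up constant names:

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical scratch location, used only for this demonstration.
MKDIR_DEMO_ROOT = Dir.mktmpdir("mkdir-p-demo")
MKDIR_DEMO_DIR  = File.join(MKDIR_DEMO_ROOT, "outputs", "backmatter")

# The first call creates outputs/ and outputs/backmatter/ in one go.
FileUtils.mkdir_p(MKDIR_DEMO_DIR)

# The second call finds the directory in place and raises no error.
FileUtils.mkdir_p(MKDIR_DEMO_DIR)

puts Dir.exist?(MKDIR_DEMO_DIR)  # true
```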
Often when writing build scripts, it’s convenient to have a quick way to blow away all of the generated files. Let’s add a task to handle this. Once again, instead of running a shell command, we’ll use a Rake helper method called rm_rf. This mirrors the shell rm -rf command, which recursively deletes files and directories without any warnings or confirmation.
task :clean do
  rm_rf "outputs"
end
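Like mkdir_p, Rake's rm_rf wraps the FileUtils method of the same name, so its behavior can be tried safely outside the project. The sketch below builds a throwaway tree in a temporary directory (a hypothetical location, with made-up constant names) and removes it twice, which shows that the clean task should not fail when there is nothing to delete.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical scratch location, used only for this demonstration.
CLEAN_DEMO_ROOT = Dir.mktmpdir("rm-rf-demo")
CLEAN_DEMO_OUT  = File.join(CLEAN_DEMO_ROOT, "outputs")

# Build a small tree resembling the generated output.
FileUtils.mkdir_p(File.join(CLEAN_DEMO_OUT, "backmatter"))
FileUtils.touch(File.join(CLEAN_DEMO_OUT, "backmatter", "appendix.html"))

FileUtils.rm_rf(CLEAN_DEMO_OUT)  # removes the directory and everything under it
FileUtils.rm_rf(CLEAN_DEMO_OUT)  # already gone: no error is raised

puts File.exist?(CLEAN_DEMO_OUT)  # false
```

The lack of any confirmation is the reason to point rm_rf only at directories that hold nothing but generated files.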
Thanks for the topic! I sent some money to the fund. I am glad people keep promoting it. I wish I was half the decent man Jim was.
The Jim Weirich https links don’t work. It seems they need to be changed to http. Parts 1-4 of your rake series are similarly affected, and I assume any other pages referencing that site are as well.