Today I will guide you through a class refactoring done in one of the projects I work on. Ideally we should be doing this test driven, but - as this is for fun - I won’t today.
This is a simple interface around the clamscan scanner. This code has several smells. 😷
Turning off the virus scanner is done using a global setting
Several external dependancies make it more difficult to port
Our primary goal is to keep backward compatibility with the rest of the code base, while cleaning up the code.
#1 Separate the generic VirusScanner from the specific clamscan.
By doing this separation we can focus on the the VirusScanner without needing to worry about clamscan.
Cool. 😎
#2 Removing the “Settings” for the virus scanner
I want to get rid of the app specific ‘virusscan_off’ setting. Honestly I’m ashamed I ever wrote that 😳
The first step is to make the behavior of the class configurable.
We created an easy way to understand the interface for VirusScanner’s scanner. It’s now also very easy to extract the ‘virusscan_off’ setting.
This works! However! This breaks one of our earlier requirements: “keep backward compatibility”. This solution also requires the developer to know what the default scanner is, and that a new smell; a smell I don’t like. 😡
Let’s do better!
Ok. I’m pretty happy with that. 😌
Now we turn to the ClamScan class.
#3 Remove external dependancies
Cocaine has a lot of powerful features, has tones of great use cases, and I had a phase where I was totally addicted to it. 😇
Here Cocaine is only used to safely use a file name in a shell command. 🐌 Shellwords - a standard lib - can deal with this easily.
Doesn’t that look a lot better? We did quite a bunch of changes here.
Comparing the output of parse_clamscan_output to a string is not very descriptive, and it’s brittle. It’s better to just have a method that changes the output to the direct answer we are interested in.
Skip unnecessary cast from Array to String and use Array#empty?
Only one more line to go. There is still a train wreck in the clamscan_result_safe? method.
That’s a lot easier to understand. 🎓
This improvement also hints at an other improvement I could add: By making sure the scanned_file actually matches the @file_path given, we would be even more sure about the behavior of this class.
Final thoughts
There is still room for improvement here.
Better handling of the clamscan binary, and PATH
Extracting the clamscan output processing to it’s own object?
Does it make sense that calling a scanner returns false when infected?
You see something I missed? 🔎
Let me know if you like more posts like this. Also, let me know if you are interested to seem me do this same refactoring using TDD. Or, if you have a piece of code you’d like me to refactor; send it to me! 📩