Configuring LeafNode on OSX

Tuesday, May 18th, 2010

Getting LeafNode to work on OSX is a bit of a pain. Messing with it today, I found that very little documentation exists on how to make this happen. The base install of LeafNode isn’t the real problem; getting it configured is:

Installing Leafnode:
sudo port install leafnode

Once leafnode is installed, you have a few commands available:
fetchnews – Fetches the listing of groups from the upstream server. Once you’ve read a group, later runs start pulling the actual articles for it.
leafnode – The server itself. It can be run a few different ways, but with the MacPorts install it’s a good idea to run it through launchctl (assuming you’re on a later version of OSX).

Once it’s installed, the configuration file you’ll need to edit is /opt/local/etc/leafnode/config. Most of what’s in config.example will help in configuring leafnode. The important settings to edit are:
server – The server name itself
username – Obvious
password – Again, obvious
expire – I personally wanted everything a newsgroup had, and to keep fetching as much as possible, since I want to be able to search archived messages. I set this really high, and I’d recommend the same.
hostname – This can be configured from your newsreader, but I added it as a fallback in case I forgot to configure it correctly there.
maxfetch – Similar to expire; it caps how many articles fetchnews pulls per group on each run (initialfetch is the separate cap for newly subscribed groups). I set this insanely high as well.
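
As a rough sketch of how those settings fit together, the snippet below writes a sample config to a scratch file in the current directory, not to the live /opt/local/etc/leafnode/config. These are not my actual settings: the server name, credentials, hostname, and the specific numbers are all placeholders made up for illustration, and the only thing assumed about the format is the "name = value" layout that config.example uses.

```shell
#!/bin/sh
# Write a sample leafnode config to a scratch file (NOT the live config).
# Every value below is a hypothetical placeholder.
cat > leafnode-config.sample <<'EOF'
## upstream news server (placeholder name)
server = news.example.com
username = your-username
password = your-password

## keep articles for a very long time (the value is in days)
expire = 36500

## fallback host name (placeholder)
hostname = mymac.example.com

## cap on articles fetched per group per run, set very high
maxfetch = 1000000
EOF

# Show the result so it can be reviewed before copying anything over.
cat leafnode-config.sample
```

Review the output, then copy the lines you want into the real config by hand.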

The plists for launchctl are located in /opt/local/etc/LaunchDaemons/org.macports.leafnode
One way to load this right away is:
sudo launchctl load /opt/local/etc/LaunchDaemons/org.macports.leafnode/org.macports.leafnode.plist

I’ve also created a symlink in /Library/LaunchDaemons that points to the plist above:
cd /Library/LaunchDaemons
sudo ln -s /opt/local/etc/LaunchDaemons/org.macports.leafnode/org.macports.leafnode.plist .

At this point, you can use Emacs to connect to your server. What’s left is mostly setting your Emacs variables. Here’s some of what I use:

;; GNUS Setup
(setq user-mail-address "youremail@domain")
(setq user-full-name "Your Name")
(setq gnus-select-method '(nntp "localhost"
                                (nntp-address "127.0.0.1")
                                (nntp-authinfo-file "~/.authinfo")
                                (nntp-port-number 119)))

(setq gnus-summary-make-false-root 'dummy)
(setq gnus-build-sparse-threads 'some)
(setq gnus-fetch-old-headers 'some)

(setq gnus-posting-styles
      '((".*"
         (signature-file "~/.signature")
         (x-url "http://www.thedarktrumpet.com/")
         (organization "The Dark Trumpet"))))

I do have a ~/.authinfo file, but the contents aren’t that important for leafnode.

In your news reader, or in Gnus, you can get the full newsgroup listing after running the initial fetchnews. Subscribe to and view each newsgroup you want, then run fetchnews again afterwards. I use:
"fetchnews -v -n" to accomplish this.

The last thing that really needs to be done is setting up your crontab. I used the root crontab for this: run "crontab -e" as root, then add the following line:
*/60 * * * * /opt/local/sbin/fetchnews -v -n

Parsing dirty data in Common Lisp

Saturday, March 27th, 2010

I came across a bit of an issue today while building a parser in Common Lisp. I had saved my archive folder in Outlook to one giant text file, and I wanted to parse out each individual email (there are over 8000) and save each one to its own file on the hard disk. Later I’d integrate this with fetchmail to pull new emails into the same folder structure, which would then be indexed by DevonThink as an external folder. It’s a bit of a long story, and I plan to write more about it, but for now: what about parsing dirty data in Common Lisp?

There are a few options for cleaning up dirty data. I came across two different ones for what I was doing:

1. Just pull the readable data – using NIL for the other bits of data
The advantage of doing this is that it’s really fast. You make a slightly different call to (with-open-file) and effectively skip the data that’s not readable. The main disadvantage is that your data won’t end up quite the way you originally wanted it. Bullet points, for example, could have been translated to a -; this method instead makes them NIL, or empty. For my case this was OK: I didn’t really care how this bit of data was translated, since I was more interested in the overall theme of each email than in its specific formatting.

To accomplish this, you can use something like the following:

(with-open-file (stream parse-file :external-format :latin1)
….)

Thanks to nikodemus on Freenode for this information.
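
If you just want a quick look at what survives before touching the Lisp side, a rough shell equivalent of "keep only the readable data" is sketched below. To be clear, this swaps in tr for the with-open-file call; it isn’t what my parser does. It assumes that "readable" means printable ASCII plus tab, newline, and carriage return, and the file names are made up for the example.

```shell
#!/bin/sh
# Build a tiny sample containing one stray byte (0x95, octal 225).
printf 'one \225 two\n' > dirty-sample.txt

# -c complements the set and -d deletes, so this keeps only tab (\11),
# newline (\12), carriage return (\15) and printable ASCII (\40-\176).
# LC_ALL=C makes tr treat the input as raw bytes.
LC_ALL=C tr -cd '\11\12\15\40-\176' < dirty-sample.txt > clean-sample.txt

cat clean-sample.txt
```

The stray byte is simply dropped, leaving the two spaces that surrounded it.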

2. Clean up the data

Emacs gives a fairly nice way of handling this, well, kinda. When you load the questionable file, type C-x RET f and set the file encoding; I used utf-8-unix at first. From there, save the file. You should be presented with a warning saying that some characters can’t be encoded with that coding system. It shows a listing, a short one anyway, of the characters it’s complaining about. Cancel the save with C-g, switch to the warning buffer with C-x o, and copy each individual character (C-space, right arrow, C-w). You can either hit enter at this point to view the first occurrence of that character, or go to the original buffer and search for it. Once you’ve decided what that character should become, hit M-< to go to the beginning of the buffer, type M-x replace-string, C-y to insert the character, hit enter, then type the substitution you want and hit enter again. It’ll replace all occurrences in that buffer. From there, rinse and repeat for the others.

The obvious disadvantage of this is it’ll take much longer to accomplish the task. The advantage is that you’ll end up with a sane file in the end. I started with this method, but went with method 1 in the end.
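
Once you know which bytes are the culprits, the same kind of substitution can be scripted outside Emacs. The sketch below handles the bullet-to-hyphen case mentioned earlier. It assumes the export is Windows-1252, where the bullet is the single byte 0x95 (octal 225); that’s a guess for an Outlook export, so check your own file first. The file names are made up for the example.

```shell
#!/bin/sh
# Sample line containing byte 0x95 (octal 225), the bullet in Windows-1252.
printf 'one \225 two\n' > bullet-sample.txt

# LC_ALL=C makes tr work on raw bytes; swap the bullet byte for a hyphen.
LC_ALL=C tr '\225' '-' < bullet-sample.txt > bullet-fixed.txt

cat bullet-fixed.txt
```

This only covers one-byte-for-one-byte swaps; anything that needs a longer replacement is still easier with replace-string.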

The one part I couldn’t figure out, and I’ll likely post an update once I get it answered, is what happens when you save the buffer while overriding the coding system: it saves as raw-text-unix, regardless of what I picked. Given an override, the warning states that I’d just lose those characters, which I was OK with. I’ll try to find out more and post later.