Message ID | 20160524184720.GA27449@debian-netbook |
---|---|
State | New |
Headers |
Received: (qmail 73327 invoked by uid 89); 24 May 2016 18:47:57 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Checked: by ClamAV 0.99.1 on sourceware.org X-Virus-Found: No X-Spam-SWARE-Status: No, score=0.7 required=5.0 tests=BAYES_00, BODY_8BITS, GARBLED_BODY, RP_MATCHES_RCVD, SPF_PASS autolearn=no version=3.3.2 spammy=PORT, 1788, pipes, 178, 8 X-Spam-Status: No, score=0.7 required=5.0 tests=BAYES_00, BODY_8BITS, GARBLED_BODY, RP_MATCHES_RCVD, SPF_PASS autolearn=no version=3.3.2 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on sourceware.org X-Spam-Level: X-HELO: lists.gnu.org Received: from lists.gnu.org (HELO lists.gnu.org) (208.118.235.17) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES256-SHA encrypted) ESMTPS; Tue, 24 May 2016 18:47:47 +0000 Received: from localhost ([::1]:54913 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from <guix-devel-bounces+patchwork=sourceware.org@gnu.org>) id 1b5HMn-0006Fk-00 for patchwork@sourceware.org; Tue, 24 May 2016 14:47:45 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:43832) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from <efraim@flashner.co.il>) id 1b5HMe-0006ER-Ul for guix-devel@gnu.org; Tue, 24 May 2016 14:47:40 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from <efraim@flashner.co.il>) id 1b5HMa-0008HP-Vz for guix-devel@gnu.org; Tue, 24 May 2016 14:47:36 -0400 Received: from flashner.co.il ([178.62.234.194]:51332) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from <efraim@flashner.co.il>) id 1b5HMa-00084v-K2; Tue, 24 May 2016 14:47:32 -0400 Received: from debian-netbook (85.65.166.137.dynamic.barak-online.net [85.65.166.137]) by flashner.co.il (Postfix) with ESMTPSA id 5B12440211; Tue, 24 May 2016 18:47:22 +0000 (UTC) Date: Tue, 24 May 2016 21:47:20 +0300 From: Efraim Flashner <efraim@flashner.co.il> To: guix-devel@gnu.org, Ludovic =?utf-8?B?PT91dGYtOD9RP0NvdXJ0PUMzPUE4cz89?= <ludo@gnu.org>, Eric Bavier <ericbavier@openmailbox.org> Subject: Adding wc to Bournish Message-ID: <20160524184720.GA27449@debian-netbook> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha512; protocol="application/pgp-signature"; boundary="WhfpMioaduB5tiZL" Content-Disposition: inline User-Agent: Mutt/1.6.1 (2016-04-27) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 178.62.234.194 X-BeenThere: guix-devel@gnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: "Development of GNU Guix and the GNU System distribution." <guix-devel.gnu.org> List-Unsubscribe: <https://lists.gnu.org/mailman/options/guix-devel>, <mailto:guix-devel-request@gnu.org?subject=unsubscribe> List-Archive: <http://lists.gnu.org/archive/html/guix-devel/> List-Post: <mailto:guix-devel@gnu.org> List-Help: <mailto:guix-devel-request@gnu.org?subject=help> List-Subscribe: <https://lists.gnu.org/mailman/listinfo/guix-devel>, <mailto:guix-devel-request@gnu.org?subject=subscribe> Errors-To: guix-devel-bounces+patchwork=sourceware.org@gnu.org Sender: "Guix-devel" <guix-devel-bounces+patchwork=sourceware.org@gnu.org> |
Commit Message
Efraim Flashner
May 24, 2016, 6:47 p.m. UTC
GSoC just started and so far I have 13 tabs from the guile manual open on my computer. I almost implemented `du' at the same time as `wc' but at the time I was getting tripped up with functions returning functions or returning values. `wc-l' and `wc-w' are two separate functions ATM. It should be possible to combine the two and increment the line-count if the delimiter is \n (or EOF) but I didn't want to spend too much time implementing and re-implementing the commands. `wc-w' closes the file after it finishes but `wc-l' doesn't. Good hygiene? Let the garbage collector take care of it? I also made liberal use of `make-string', this seems like a memory leak nightmare to me. I wrapped `file-size' and `wc-command' in checks that the file actually exists and can be read by the user. One of the things drilled into me from turning in assignments in school was to wrap the code in checks so it'd pass the TAs checks. `wc-command', it doesn't accept arguments and I'm not happy with the method of printing out the results but it works. bournish@(guile-user)> wc gpl-3.0.txt 674 5644 35147 gpl-3.0.txt What should I add to my .guile to auto-load guix.scm from the repo? (load /path/to/guix/guix.scm) didn't work for me.
Comments
Efraim Flashner <efraim@flashner.co.il> writes: > +(define (wc-command file) > + (if (and (file-exists? file) (access? file 4)) > + (let* ((wc-l ((@@ (guix build bournish) wc-l-command) file)) > + (wc-w ((@@ (guix build bournish) wc-w-command) file)) > + (wc-c ((@@ (guix build bournish) wc-c-command) file))) > + (begin > + (display wc-l)(display #\space) > + (display wc-w)(display #\space) > + (display wc-c)(display #\space) > + (display file) > + (newline))))) Have you considered using “format” instead of “display”? The last “begin” could just be this: (format #t "~a ~a ~a ~a\n" wc-l wc-w wc-c file) or (format #t "~{~a ~}\n" (list wc-l wc-w wc-c file)) Why is it necessary to do “(@@ (guix build bournish) ...)”? Can it not be imported once? ~~ Ricardo
Efraim Flashner <efraim@flashner.co.il> writes: > +(define (wc-l-command file) > + (let* ((input-file (open-file file "r")) > + (line (read-line input-file)) > + (line-count 0)) > + (while (not (eof-object? line)) > + (set! line-count (1+ line-count)) > + (set! line (read-line input-file))) > + line-count)) It’s unusual for me to see the use of “while” and “set!” in Scheme code. You could do this in a functional manner using a fold (see SRFI-1) or with file streams (see SRFI-41), which also provides a stream-fold. The idea with a fold is that you have a function that takes a value (e.g. from a list or a stream) and an intermediate result. The function does something to the value and then returns a new intermediate result. Here’s a fold over a list of symbols implementing a count: (fold (lambda (_ res) (+ res 1)) ; increase the result 0 ; start at 0 '(hello world bye)) ; items to count If you had a file stream, where each element represents one line, you can fold over all lines in much the same way to get a count. You could use the same framework with a different stream element generator (reading one word or byte at a time instead of one line at a time) to implement the other features of “wc”. There’s an example of how to define a file stream in the Guile manual in the documentation for SRFI-41. ~~ Ricardo
On Wed, May 25, 2016 at 11:03:34AM +0200, Ricardo Wurmus wrote: > > Efraim Flashner <efraim@flashner.co.il> writes: > > > +(define (wc-command file) > > + (if (and (file-exists? file) (access? file 4)) > > + (let* ((wc-l ((@@ (guix build bournish) wc-l-command) file)) > > + (wc-w ((@@ (guix build bournish) wc-w-command) file)) > > + (wc-c ((@@ (guix build bournish) wc-c-command) file))) > > + (begin > > + (display wc-l)(display #\space) > > + (display wc-w)(display #\space) > > + (display wc-c)(display #\space) > > + (display file) > > + (newline))))) > > Have you considered using “format” instead of “display”? The last > “begin” could just be this: > > (format #t "~a ~a ~a ~a\n" > wc-l wc-w wc-c file) > > or > > (format #t "~{~a ~}\n" > (list wc-l wc-w wc-c file)) I looked at the manual more and I found for-each which does what I was looking for before which gave me this: (for-each display (list wc-l #\space wc-w #\space wc-c #\space file)) but I do like the (format #t "~{~a ~}\n" (list ...)) better. I'm working on trying to get it to read flags and the variable size of the list works better with that. format vs display, I knew I wanted display over write but I hadn't come across format yet. If I don't need to specify #\space to get a space then that sounds great. > > Why is it necessary to do “(@@ (guix build bournish) ...)”? Can it not > be imported once? When you say it out loud then it seems obvious it shouldn't need to be imported more than once. I've also read more about let vs let* and I've changed some to let. > > ~~ Ricardo Thanks!
On Wed, May 25, 2016 at 11:26:11AM +0200, Ricardo Wurmus wrote: > > Efraim Flashner <efraim@flashner.co.il> writes: > > > +(define (wc-l-command file) > > + (let* ((input-file (open-file file "r")) > > + (line (read-line input-file)) > > + (line-count 0)) > > + (while (not (eof-object? line)) > > + (set! line-count (1+ line-count)) > > + (set! line (read-line input-file))) > > + line-count)) > > It’s unusual for me to see the use of “while” and “set!” in Scheme code. > You could do this in a functional manner using a fold (see SRFI-1) or > with file streams (see SRFI-41), which also provides a stream-fold. I'm going to put set! in the GOTO box. GOTO worked well enough when I learned QBasic, but just because its implemented doesn't mean I should still use it. (ie: its there but try not to use it) > > The idea with a fold is that you have a function that takes a value > (e.g. from a list or a stream) and an intermediate result. The function > does something to the value and then returns a new intermediate result. I already like this much better > > Here’s a fold over a list of symbols implementing a count: > > (fold > (lambda (_ res) (+ res 1)) ; increase the result > 0 ; start at 0 > '(hello world bye)) ; items to count > > If you had a file stream, where each element represents one line, you > can fold over all lines in much the same way to get a count. You could > use the same framework with a different stream element generator > (reading one word or byte at a time instead of one line at a time) to > implement the other features of “wc”. I'll take another look at streams and rework that. I still think asking the filesystem how big the file is works well for `wc -c', but `wc -l' and `wc -w' should really be part of the same function. > > There’s an example of how to define a file stream in the Guile manual in > the documentation for SRFI-41. That looks like a good place to start then :) > > ~~ Ricardo
Hello! Efraim Flashner <efraim@flashner.co.il> skribis: > GSoC just started and so far I have 13 tabs from the guile manual open > on my computer. I almost implemented `du' at the same time as `wc' but > at the time I was getting tripped up with functions returning functions > or returning values. Nice! Some comments to complement what Ricardo et al. wrote: > +(define (wc-w-command file) > + (let* ((input-file (open-file file "r")) > + (line (make-string 80)) > + (word-size (read-delimited! " \t\n" line input-file)) > + (word-count 0)) > + (while (not (port-closed? input-file)) > + (if (not (zero? word-size)) > + (set! word-count (1+ word-count))) > + (set! line (make-string 80)) > + (if (not (eof-object? (peek-char input-file))) > + (set! word-size (read-delimited! " \t\n" line input-file)) > + (close-input-port input-file))) > + word-count)) First, regarding the port not being closed, it’s a good idea to use ‘call-with-input-file’ instead of ‘open-file’: ‘call-with-input-file’ makes sure the port gets closed when its “dynamic extent” is left, that is, regardless of whether we leave the procedure “normally” or via an exception. (call-with-input-file file (lambda (port) …)) Also, imperative programming Should Be Avoided; see “Coding Style” in the manual. ;-) We don’t use an i/o monad or any such thing, but ‘set!’ and string modifications is definitely frowned upon. As Ricardo suggests, you could use ‘port->stream’ and ‘stream-fold’ to iterate over the characters read from the port. I suspect that’d be rather slow though, at least on 2.0, so another option is something like: (define (lines+chars port) ;; Return the number of lines and number of chars read from PORT. (let loop ((lines 1) (chars 0)) (match (read-char port) ((? eof-object?) ;done! (values lines port)) (#\newline ;recurse (loop (+ 1 lines) (+ 1 chars))) (_ ;recurse (loop lines (+ 1 chars)))))) (define (wc-command file) (let-values (((lines chars) (call-with-input-file file lines+chars))) (format #t "~a ~a ~a~%" lines chars file))) > +(define (wc-command file) > + (if (and (file-exists? file) (access? file 4)) This check is not needed and is subject to a race condition (“TOCTTOU”); just let ‘call-with-input-file’ error out if the file cannot be read. Bonus point: catch ‘system-error’ exceptions and report the inability to open the file in a nice user-friendly way (but really, don’t bother about it for now.) > + (let* ((wc-l ((@@ (guix build bournish) wc-l-command) file)) > + (wc-w ((@@ (guix build bournish) wc-w-command) file)) > + (wc-c ((@@ (guix build bournish) wc-c-command) file))) > + (begin > + (display wc-l)(display #\space) > + (display wc-w)(display #\space) > + (display wc-c)(display #\space) > + (display file) > + (newline))))) Remember that Bournish is a compiler that compiles Bash to Scheme. So we must distinguish the support functions that are used at run time, such as ‘ls-command-implementation’, from what the Scheme code that the compiler emits (compile time). In the case of ‘ls’, when the compiler encounters ‘ls’ in the input, it emits this code: ((@@ (guix build bournish) ls-command-implementation)) ‘ls-command-implementation’ is the implementation that is called when we run the compiled program. Thus, you must similarly distinguish those two stages by providing: 1. A ‘wc-command-implementation’ procedure that implements ‘wc’; 2. A ‘wc-command’ procedure that emits the code that calls ‘wc-command-implementation’; so something like: (define (wc-command args) `((@@ (guix build bournish) wc-command-implementation) ,@args)) Better yet, ‘wc-command’ could check for the presence of “-l” or “-c” at compile time and emit a call to the right thing. HTH! Ludo’.
On Thu, May 26, 2016 at 10:46:21AM +0200, Ludovic Courtès wrote: > > As Ricardo suggests, you could use ‘port->stream’ and ‘stream-fold’ to > iterate over the characters read from the port. I suspect that’d be > rather slow though, at least on 2.0, so another option is something > like: > > (define (lines+chars port) > ;; Return the number of lines and number of chars read from PORT. > (let loop ((lines 1) (chars 0)) > (match (read-char port) > ((? eof-object?) ;done! > (values lines port)) > (#\newline ;recurse > (loop (+ 1 lines) (+ 1 chars))) > (_ ;recurse > (loop lines (+ 1 chars)))))) > > (define (wc-command file) > (let-values (((lines chars) > (call-with-input-file file lines+chars))) > (format #t "~a ~a ~a~%" lines chars file))) > Are you suggesting just dropping the word count part of `wc'? I've been thinking about it, and the simplest way I can think of to describe a word is a space followed by a character, or, to use the char-sets from the guile manual, a character from char-set:whitespace followed by a character from char-set:graphic. I can compare (read-char port) and (peek-char port) to get a word count (possibly). > > +(define (wc-command file) > > + (if (and (file-exists? file) (access? file 4)) > > This check is not needed and is subject to a race condition (“TOCTTOU”); > just let ‘call-with-input-file’ error out if the file cannot be read. > > Bonus point: catch ‘system-error’ exceptions and report the inability to > open the file in a nice user-friendly way (but really, don’t bother > about it for now.) > I'm still wrapping my head around the following part. My wife says when I work I scowl at the computer a lot and mutter :) > > Remember that Bournish is a compiler that compiles Bash to Scheme. > So we must distinguish the support functions that are used at run time, > such as ‘ls-command-implementation’, from what the Scheme code that the > compiler emits (compile time). > > In the case of ‘ls’, when the compiler encounters ‘ls’ in the input, it > emits this code: > > ((@@ (guix build bournish) ls-command-implementation)) > > ‘ls-command-implementation’ is the implementation that is called when we > run the compiled program. > > Thus, you must similarly distinguish those two stages by providing: > > 1. A ‘wc-command-implementation’ procedure that implements ‘wc’; > > 2. A ‘wc-command’ procedure that emits the code that calls > ‘wc-command-implementation’; so something like: > > (define (wc-command args) > `((@@ (guix build bournish) wc-command-implementation) > ,@args)) > > Better yet, ‘wc-command’ could check for the presence of “-l” or > “-c” at compile time and emit a call to the right thing. I checked with coreutil's 'wc', and it emits in its particular order whether you call 'wc -l -c' or 'wc -lc' or 'wc -cl' > > HTH! > > Ludo’.
diff --git a/guix/build/bournish.scm b/guix/build/bournish.scm index 4022796..336a650 100644 --- a/guix/build/bournish.scm +++ b/guix/build/bournish.scm @@ -1,5 +1,6 @@ ;;; GNU Guix --- Functional package management for GNU ;;; Copyright © 2016 Ludovic Courtès <ludo@gnu.org> +;;; Copyright © 2016 Efraim Flashner <efraim@flashner.co.il> ;;; ;;; This file is part of GNU Guix. ;;; @@ -103,6 +104,54 @@ characters." ((@ (guix build utils) dump-port) port (current-output-port)) *unspecified*))) +(define (file-contents file) + `(make-stream + (port->stream (open-input-file ,file) read-char))) + +(define (file-size file) + (if (and (file-exists? file) + (access? file 4)) + (stat:size (stat file)) + *unspecified*)) + +(define (wc-c-command file) + ((@@ (guix build bournish) file-size) file)) + +(define (wc-w-command file) + (let* ((input-file (open-file file "r")) + (line (make-string 80)) + (word-size (read-delimited! " \t\n" line input-file)) + (word-count 0)) + (while (not (port-closed? input-file)) + (if (not (zero? word-size)) + (set! word-count (1+ word-count))) + (set! line (make-string 80)) + (if (not (eof-object? (peek-char input-file))) + (set! word-size (read-delimited! " \t\n" line input-file)) + (close-input-port input-file))) + word-count)) + +(define (wc-l-command file) + (let* ((input-file (open-file file "r")) + (line (read-line input-file)) + (line-count 0)) + (while (not (eof-object? line)) + (set! line-count (1+ line-count)) + (set! line (read-line input-file))) + line-count)) + +(define (wc-command file) + (if (and (file-exists? file) (access? file 4)) + (let* ((wc-l ((@@ (guix build bournish) wc-l-command) file)) + (wc-w ((@@ (guix build bournish) wc-w-command) file)) + (wc-c ((@@ (guix build bournish) wc-c-command) file))) + (begin + (display wc-l)(display #\space) + (display wc-w)(display #\space) + (display wc-c)(display #\space) + (display file) + (newline))))) + (define (help-command . _) (display "\ Hello, this is Bournish, a minimal Bourne-like shell in Guile! @@ -129,7 +178,8 @@ commands such as 'ls' and 'cd'; it lacks globbing, pipes---everything.\n")) ("help" ,help-command) ("ls" ,ls-command) ("which" ,which-command) - ("cat" ,cat-command))) + ("cat" ,cat-command) + ("wc" ,wc-command))) (define (read-bournish port env) "Read a Bournish expression from PORT, and return the corresponding Scheme