Computing the Intersection and Difference of Two Files in the Linux Shell

Published: 2021-12-15 20:24:14  Category: Tutorials  Source: Internet


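Suppose two files, FILE1 and FILE2, represent the sets A and B. The commands in this article refer to them as a.txt and b.txt. FILE1 (a.txt) contains: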

a
b
c
e
d
a

FILE2 (b.txt) contains:

c
d
a
c
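
To follow along, the two sample files can be recreated as below. This is a minimal sketch: it assumes the file names a.txt and b.txt that the commands in this article use, and that it is run in a scratch directory where overwriting those names is harmless.

```shell
# Recreate the sample files in the current directory.
# printf '%s\n' prints each argument on its own line.
printf '%s\n' a b c e d a > a.txt   # FILE1, i.e. set A (note the duplicate "a")
printf '%s\n' c d a c > b.txt       # FILE2, i.e. set B (note the duplicate "c")
```
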

There are essentially two approaches: the comm command and the grep command. Each is described below.

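The comm command. Its manual describes it as follows: Compare sorted files FILE1 and FILE2 line by line. With no options, produce three-column output. Column one contains lines unique to FILE1, column two contains lines unique to FILE2, and column three contains lines common to both files. Note that comm itself only requires both files to be sorted; for the result to behave like a set operation, each file should also contain no duplicate lines (sorted and unique). By default the output has three columns: the first is A-B, the second is B-A, and the third is the intersection of A and B.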

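Running it directly on the unsorted files gives a result that is not meaningful, because comm assumes sorted input: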

$ comm a.txt b.txt
a
b
                c
        d
        a
        c
e
d
a
After sorting only (duplicates still show up as separate lines):

$ comm <(sort a.txt ) <(sort b.txt )
                a
a
b
                c
        c
                d
e
After sorting and removing duplicates:

$ comm <(sort a.txt|uniq ) <(sort b.txt|uniq )
                a
b
                c
                d
e
If only the intersection is wanted, suppress columns 1 and 2 with -12:

$ comm -12 <(sort a.txt|uniq ) <(sort b.txt|uniq )
a
c
d
The difference is left as an exercise for the reader.
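
For readers who want to check their answer, the difference can be sketched with the same comm approach by suppressing different columns. The sketch below is one possible way to do it, not the only one. It uses intermediate files (the names a.sorted and b.sorted are made up for this example) instead of process substitution so that it also runs in a plain POSIX sh, and it uses sort -u, which sorts and drops duplicate lines in one step.

```shell
# Recreate the sample files (same contents as FILE1 and FILE2 above).
printf '%s\n' a b c e d a > a.txt
printf '%s\n' c d a c > b.txt

# comm needs sorted input; sort -u also removes duplicate lines.
# a.sorted and b.sorted are hypothetical names chosen for this sketch.
sort -u a.txt > a.sorted
sort -u b.txt > b.sorted

# A - B: suppress columns 2 and 3, keeping lines found only in the first file.
comm -23 a.sorted b.sorted    # prints b and e

# B - A: suppress columns 1 and 3, keeping lines found only in the second file.
comm -13 a.sorted b.sorted    # prints nothing: every line of b.txt is also in a.txt
```

The only thing that changes between intersection and the two differences is which columns are suppressed: -12 leaves column 3, -23 leaves column 1, and -13 leaves column 2.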

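The grep command is commonly used to search text. Here -F treats each pattern as a fixed string and -f a.txt reads the patterns from a.txt, one per line. To find the intersection: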

$ grep -F -f a.txt b.txt
c
d
a
c
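grep does not require sorted input, but since these are set operations, the result must contain no duplicates (otherwise it would not be a set). Note also that -F alone matches a pattern anywhere within a line; adding -x restricts matches to whole lines, which matters when one line can be a substring of another. With the single-character lines used here the results are the same either way. So: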

$ grep -F -f a.txt b.txt | sort | uniq
a
c
d
And the difference? Add -v to invert the match:

$ grep -F -v -f a.txt b.txt | sort | uniq
$ grep -F -v -f b.txt a.txt | sort | uniq
b
e
The first command yields B-A, which is empty here; the second yields A-B. Note that the order of the two files matters!
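
One caveat with the grep approach is that -F without -x matches substrings. The sketch below illustrates this with two small files that are not part of the original example (the names x.txt and y.txt and their contents are made up for this illustration); it should be run in a scratch directory.

```shell
# Hypothetical files chosen so that one line ("a") is a substring of another ("ab").
printf '%s\n' a > x.txt
printf '%s\n' ab a > y.txt

# Without -x, the pattern "a" also matches the line "ab".
grep -F -f x.txt y.txt           # prints ab and a

# With -x, a pattern must match the entire line.
grep -F -x -f x.txt y.txt        # prints a

# Difference y - x with whole-line matching: lines of y.txt not present in x.txt.
grep -F -x -v -f x.txt y.txt     # prints ab
```

For line-based set operations on real data, adding -x is usually the safer choice; piping through sort | uniq as shown earlier still applies when duplicates must be removed.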

(Editor: 开发网_郴州站长网)
