函数式编程艺术 / 05 Map/Filter/Reduce
05 Map/Filter/Reduce
“map/filter/reduce 是函数式编程的三驾马车——掌握它们,你就掌握了函数式数据处理的基础。”
5.1 Map(映射)
map 对集合中的每个元素应用一个函数,返回新的集合。
5.1.1 类型签名
map :: (a -> b) -> [a] -> [b]
-- 接收一个 a→b 的函数和一个 [a] 列表
-- 返回一个 [b] 列表
5.1.2 各语言实现
Haskell:
-- 内置 map
map (*2) [1, 2, 3, 4] -- [2, 4, 6, 8]
map toUpper "hello" -- "HELLO"(String 即 [Char])
map length ["hello", "world"] -- [5, 5]
-- 自定义 map
myMap :: (a -> b) -> [a] -> [b]
myMap _ [] = []
myMap f (x:xs) = f x : myMap f xs
JavaScript:
// 基本 map
[1, 2, 3, 4].map(x => x * 2); // [2, 4, 6, 8]
['hello', 'world'].map(s => s.toUpperCase()); // ['HELLO', 'WORLD']
// 对象数组 map
const users = [
{ name: 'Alice', age: 30 },
{ name: 'Bob', age: 25 },
];
users.map(u => ({ ...u, age: u.age + 1 }));
// async map
const asyncMap = async (fn, arr) =>
Promise.all(arr.map(fn));
const results = await asyncMap(
async (url) => fetch(url).then(r => r.json()),
['/api/users', '/api/posts']
);
Python:
# 使用 map 函数
list(map(lambda x: x * 2, [1, 2, 3, 4])) # [2, 4, 6, 8]
# 列表推导式(Python 风格)
[x * 2 for x in [1, 2, 3, 4]] # [2, 4, 6, 8]
# 嵌套 map
matrix = [[1, 2], [3, 4], [5, 6]]
[list(map(lambda x: x * 2, row)) for row in matrix]
# [[2, 4], [6, 8], [10, 12]]
Rust:
// 迭代器 map
let doubled: Vec<i32> = vec![1, 2, 3, 4].iter().map(|x| x * 2).collect();
// [2, 4, 6, 8]
// Option map
let some_number = Some(5);
let doubled = some_number.map(|x| x * 2); // Some(10)
// Result map
let ok_value: Result<i32, &str> = Ok(5);
let doubled = ok_value.map(|x| x * 2); // Ok(10)
Clojure:
(map #(* 2 %) [1 2 3 4]) ;; (2 4 6 8)
(map clojure.string/upper-case ["hello" "world"]) ;; ("HELLO" "WORLD")
;; 多集合 map
(map + [1 2 3] [10 20 30]) ;; (11 22 33)
5.1.3 Map 的定律
| 定律 | 表达式 | 含义 |
|---|---|---|
| 恒等 | map id xs ≡ xs | 映射恒等函数不变 |
| 组合 | map (f . g) xs ≡ map f (map g xs) | 先 map g 再 map f 等于一次 map 组合 |
-- 恒等律
map id [1, 2, 3] == [1, 2, 3] -- True
-- 组合律
map ((*2) . (+1)) [1, 2, 3] == map (*2) (map (+1) [1, 2, 3]) -- True
5.2 Filter(过滤)
filter 根据谓词函数保留满足条件的元素。
5.2.1 类型签名
filter :: (a -> Bool) -> [a] -> [a]
5.2.2 各语言实现
Haskell:
filter even [1..10] -- [2, 4, 6, 8, 10]
filter (>5) [1..10] -- [6, 7, 8, 9, 10]
filter (\c -> c /= ' ') "hello world" -- "helloworld"
-- 自定义 filter
myFilter :: (a -> Bool) -> [a] -> [a]
myFilter _ [] = []
myFilter f (x:xs)
| f x = x : myFilter f xs
| otherwise = myFilter f xs
JavaScript:
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10].filter(x => x % 2 === 0);
// [2, 4, 6, 8, 10]
const users = [
{ name: 'Alice', active: true, age: 30 },
{ name: 'Bob', active: false, age: 25 },
{ name: 'Charlie', active: true, age: 35 },
];
// 多条件 filter
users
.filter(u => u.active)
.filter(u => u.age >= 30);
// [{ name: 'Alice', ... }, { name: 'Charlie', ... }]
// reject(filter 的反面)
const reject = (fn, arr) => arr.filter((...args) => !fn(...args));
reject(x => x > 3, [1, 2, 3, 4, 5]); // [1, 2, 3]
Python:
list(filter(lambda x: x % 2 == 0, range(1, 11)))
# [2, 4, 6, 8, 10]
# 列表推导式
[x for x in range(1, 11) if x % 2 == 0]
# [2, 4, 6, 8, 10]
# 多条件
users = [
{'name': 'Alice', 'active': True, 'age': 30},
{'name': 'Bob', 'active': False, 'age': 25},
{'name': 'Charlie', 'active': True, 'age': 35},
]
[u for u in users if u['active'] and u['age'] >= 30]
Rust:
let evens: Vec<i32> = (1..=10).filter(|x| x % 2 == 0).collect();
// [2, 4, 6, 8, 10]
// 组合 filter
let result: Vec<i32> = (1..=100)
.filter(|x| x % 2 == 0)
.filter(|x| x % 3 == 0)
.collect();
// [6, 12, 18, 24, ...]
Clojure:
(filter even? (range 1 11)) ;; (2 4 6 8 10)
(filter #(> % 5) (range 1 11)) ;; (6 7 8 9 10)
;; remove 是 filter 的反面
(remove even? (range 1 11)) ;; (1 3 5 7 9)
5.2.3 Filter 的定律
| 定律 | 表达式 | 含义 |
|---|---|---|
| 恒等 | filter (const True) xs ≡ xs | 保留所有元素不变 |
| 空 | filter (const False) xs ≡ [] | 过滤掉所有元素 |
| 幂等 | filter p (filter p xs) ≡ filter p xs | 多次过滤同条件结果不变 |
| 分配 | filter p (map f xs) ≡ map f (filter (p . f) xs) | 先 map 再 filter 等价 |
5.3 Reduce/Fold(归约)
reduce/fold 将集合归约为单个值,是最强大的列表操作——map 和 filter 都可以用 fold 实现。
5.3.1 左折叠 vs 右折叠
| 操作 | 类型签名 | 求值顺序 | 结合性 |
|---|---|---|---|
| foldl (左折叠) | (b -> a -> b) -> b -> [a] -> b | 从左到右 | 尾递归优化 |
| foldr (右折叠) | (a -> b -> b) -> b -> [a] -> b | 从右到左 | 惰性求值友好 |
foldl (+) 0 [1, 2, 3] → ((0 + 1) + 2) + 3 = 6
foldr (+) 0 [1, 2, 3] → 1 + (2 + (3 + 0)) = 6
5.3.2 各语言实现
Haskell:
-- foldl'(严格左折叠,推荐使用)
import Data.List (foldl')
foldl' (+) 0 [1, 2, 3, 4, 5] -- 15
foldl' (*) 1 [1, 2, 3, 4, 5] -- 120
-- foldr(右折叠)
foldr (:) [] [1, 2, 3] -- [1, 2, 3](append)
foldr (\x acc -> acc ++ [x]) [] [1, 2, 3] -- [3, 2, 1](反转)
-- 用 fold 实现 map
myMap :: (a -> b) -> [a] -> [b]
myMap f = foldr (\x acc -> f x : acc) []
-- 用 fold 实现 filter
myFilter :: (a -> Bool) -> [a] -> [a]
myFilter p = foldr (\x acc -> if p x then x : acc else acc) []
-- 用 fold 实现 length
myLength :: [a] -> Int
myLength = foldl' (\acc _ -> acc + 1) 0
-- 用 fold 实现 reverse
myReverse :: [a] -> [a]
myReverse = foldl' (flip (:)) []
JavaScript:
// 左折叠
[1, 2, 3, 4, 5].reduce((acc, x) => acc + x, 0); // 15
// 用 reduce 实现 map
const map = (fn, arr) =>
arr.reduce((acc, x) => [...acc, fn(x)], []);
// 用 reduce 实现 filter
const filter = (fn, arr) =>
arr.reduce((acc, x) => fn(x) ? [...acc, x] : acc, []);
// 用 reduce 实现 flat
const flatten = (arr) =>
arr.reduce((acc, x) =>
Array.isArray(x) ? [...acc, ...flatten(x)] : [...acc, x], []);
// 复杂示例:统计词频
const wordFrequency = (text) =>
text.toLowerCase().split(/\s+/).reduce((freq, word) => ({
...freq,
[word]: (freq[word] || 0) + 1,
}), {});
Python:
from functools import reduce
# 左折叠
reduce(lambda acc, x: acc + x, [1, 2, 3, 4, 5], 0) # 15
# 用 reduce 实现 map
def my_map(fn, xs):
return reduce(lambda acc, x: acc + [fn(x)], xs, [])
# 用 reduce 实现 filter
def my_filter(fn, xs):
return reduce(lambda acc, x: acc + [x] if fn(x) else acc, xs, [])
# 统计词频
def word_frequency(text):
return reduce(
lambda freq, word: {**freq, word: freq.get(word, 0) + 1},
text.lower().split(),
{}
)
Rust:
// fold(左折叠)
let sum: i32 = (1..=5).fold(0, |acc, x| acc + x); // 15
// 用 fold 实现 filter + map
let result: Vec<i32> = (1..=10).fold(Vec::new(), |mut acc, x| {
if x % 2 == 0 {
acc.push(x * x);
}
acc
});
// iter().fold vs collect
// fold 更灵活,collect 更简洁
let sum1: i32 = (1..=5).fold(0, |acc, x| acc + x);
let sum2: i32 = (1..=5).sum(); // 等价
Clojure:
(reduce + [1 2 3 4 5]) ;; 15
(reduce + 0 [1 2 3 4 5]) ;; 15(带初始值)
;; 用 reduce 实现 map
(defn my-map [f xs]
(reduce (fn [acc x] (conj acc (f x))) [] xs))
;; 用 reduce 实现 filter
(defn my-filter [pred xs]
(reduce (fn [acc x] (if (pred x) (conj acc x) acc)) [] xs))
;; 统计词频
(defn word-frequency [text]
(reduce
(fn [freq word] (update freq word (fnil inc 0)))
{}
(clojure.string/split (clojure.string/lower-case text) #"\s+")))
5.3.3 Reduce 的定律
| 定律 | 表达式 | 含义 |
|---|---|---|
| 左恒等 | foldl (flip (:)) [] xs ≡ reverse xs | 左折叠 cons 等于反转 |
| 右恒等 | foldr (:) [] xs ≡ xs | 右折叠 cons 等于原列表 |
| fold/build | foldr k z (build g) ≡ g k z | GHC 融合优化 |
5.4 Scan(扫描)
scan 类似 reduce,但返回每一步的中间结果。
5.4.1 类型签名
scanl :: (b -> a -> b) -> b -> [a] -> [b]
scanr :: (a -> b -> b) -> b -> [a] -> [b]
5.4.2 各语言实现
Haskell:
scanl (+) 0 [1, 2, 3, 4] -- [0, 1, 3, 6, 10]
scanl (*) 1 [1, 2, 3, 4] -- [1, 1, 2, 6, 24]
-- 实用示例:累计和
runningSums :: Num a => [a] -> [a]
runningSums = scanl1 (+)
runningSums [1, 2, 3, 4] -- [1, 3, 6, 10]
JavaScript:
// 实现 scan
const scan = (fn, initial) => (arr) =>
arr.reduce((acc, x) => {
acc.push(fn(acc[acc.length - 1], x));
return acc;
}, [initial]);
const runningSum = scan((a, b) => a + b, 0);
runningSum([1, 2, 3, 4]); // [0, 1, 3, 6, 10]
// RxJS 中的 scan
import { from, scan } from 'rxjs';
from([1, 2, 3, 4]).pipe(
scan((acc, x) => acc + x, 0)
).subscribe(console.log);
// 输出: 0, 1, 3, 6, 10
Python:
from itertools import accumulate
list(accumulate([1, 2, 3, 4])) # [1, 3, 6, 10]
list(accumulate([1, 2, 3, 4], lambda a, b: a * b)) # [1, 2, 6, 24]
Rust:
let sums: Vec<i32> = vec![1, 2, 3, 4]
.iter()
.scan(0, |acc, &x| { *acc += x; Some(*acc) })
.collect();
// [1, 3, 6, 10]
Clojure:
;; Clojure 没有内置 scan,但可以实现
(defn scanl [f init xs]
(reductions f init xs))
(reductions + 0 [1 2 3 4]) ;; (0 1 3 6 10)
5.5 管道操作(Pipeline)
管道操作是将一系列变换串联起来的模式,数据从左到右流过每个变换。
5.5.1 各语言的管道
Haskell:
-- 使用 $ 和 &
import Data.Function ((&))
-- $ 从右到左
result1 = sum . map (*2) . filter even $ [1..10]
-- 等价于
result1 = sum (map (*2) (filter even [1..10]))
-- & 从左到右(管道风格)
result2 = [1..10]
& filter even
& map (*2)
& sum
JavaScript:
// 管道函数
const pipe = (...fns) => (x) => fns.reduce((v, f) => f(v), x);
const process = pipe(
arr => arr.filter(x => x % 2 === 0),
arr => arr.map(x => x * 2),
arr => arr.reduce((sum, x) => sum + x, 0)
);
process([1, 2, 3, 4, 5, 6, 7, 8, 9, 10]); // 60
// TC39 Pipeline Operator (Stage 2)
// result = [1,2,3,4,5,6,7,8,9,10]
// |> arr => arr.filter(x => x % 2 === 0)
// |> arr => arr.map(x => x * 2)
// |> arr => arr.reduce((sum, x) => sum + x, 0)
Python:
# 使用 pipe 库
from pipe import select, where
result = (
range(1, 11)
| where(lambda x: x % 2 == 0)
| select(lambda x: x * 2)
)
# 手动管道
def pipeline(*fns):
return lambda x: reduce(lambda v, f: f(v), fns, x)
process = pipeline(
lambda xs: [x for x in xs if x % 2 == 0],
lambda xs: [x * 2 for x in xs],
sum
)
process(range(1, 11)) # 60
Rust:
// 迭代器链式调用就是管道
let result: i32 = (1..=10)
.filter(|x| x % 2 == 0)
.map(|x| x * 2)
.sum();
// result = 60
// 使用 tap(inspect)调试
let result: Vec<i32> = (1..=10)
.filter(|x| x % 2 == 0)
.inspect(|x| println!("after filter: {}", x))
.map(|x| x * 2)
.inspect(|x| println!("after map: {}", x))
.collect();
Clojure:
;; 线程宏 -> (管道)
(->> (range 1 11)
(filter even?)
(map #(* 2 %))
(reduce +))
;; 60
;; -> 用于第一个参数位置
(-> " Hello World "
clojure.string/trim
clojure.string/lower-case
(clojure.string/split #"\s+"))
;; ["hello" "world"]
5.6 Transducer(转换器)
Transducer 是 composable(可组合的)算法转换,与集合的遍历策略解耦。
5.6.1 核心思想
传统 map/filter: 每次操作都创建中间集合
Transducer: 将所有转换合并为一次遍历,无中间集合
5.6.2 Clojure 中的 Transducer
;; 传统方式:三次遍历
(->> (range 1000000)
(filter even?) ;; 中间集合 1
(map #(* 2 %)) ;; 中间集合 2
(take 10)) ;; 中间集合 3
;; Transducer 方式:一次遍历
(transduce
(comp
(filter even?)
(map #(* 2 %))
(take 10))
conj
[]
(range 1000000))
;; Transducer 可以复用到不同上下文
(def xform (comp (filter even?) (map #(* 2 %))))
;; 用于向量
(into [] xform (range 100))
;; 用于惰性序列
(sequence xform (range 100))
;; 用于异步通道
(let [ch (async/chan 10 xform)]
(async/>!! ch 1)
(async/<!! ch))
5.6.3 JavaScript 中的 Transducer
// 简化的 Transducer 实现
const map = (fn) => (reducer) => (acc, x) => reducer(acc, fn(x));
const filter = (pred) => (reducer) => (acc, x) =>
pred(x) ? reducer(acc, x) : acc;
const compose = (...fns) => (x) =>
fns.reduceRight((acc, fn) => fn(acc), x);
const append = (acc, x) => [...acc, x];
// 组合 Transducer
const xform = compose(
filter(x => x % 2 === 0),
map(x => x * 2)
);
// 应用到数组
const transduce = (xform, reducer, init, coll) =>
coll.reduce(xform(reducer), init);
transduce(xform, append, [], [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]);
// [4, 8, 12, 16, 20]
5.6.4 性能对比
| 方式 | 遍历次数 | 中间分配 | 内存使用 |
|---|---|---|---|
| 链式 map/filter | N 次 | N-1 个中间集合 | 高 |
| 手动 for 循环 | 1 次 | 无 | 低 |
| Transducer | 1 次 | 无 | 低 |
5.7 FlatMap(展平映射)
FlatMap 先映射再展平一层。
5.7.1 各语言实现
Haskell:
-- concatMap = concat . map
concatMap (\x -> [x, x*2]) [1, 2, 3]
-- [1, 2, 2, 4, 3, 6]
-- 也叫 >>= (bind)
-- Maybe monad 示例
safeDivide :: Int -> Int -> Maybe Int
safeDivide _ 0 = Nothing
safeDivide a b = Just (a `div` b)
result :: Maybe Int
result = Just 10 >>= \x -> safeDivide x 2 -- Just 5
JavaScript:
// flatMap
[1, 2, 3].flatMap(x => [x, x * 2]); // [1, 2, 2, 4, 3, 6]
// 展平嵌套数组
[[1, 2], [3, 4], [5, 6]].flatMap(x => x); // [1, 2, 3, 4, 5, 6]
// 处理可选值
const users = [
{ name: 'Alice', email: 'alice@example.com' },
{ name: 'Bob', email: null },
{ name: 'Charlie', email: 'charlie@example.com' },
];
const emails = users.flatMap(u => u.email ? [u.email] : []);
// ['alice@example.com', 'charlie@example.com']
Python:
from itertools import chain
# 使用 chain.from_iterable 展平
list(chain.from_iterable([[1, 2], [3, 4], [5, 6]]))
# [1, 2, 3, 4, 5, 6]
# flatMap
list(chain.from_iterable([[x, x*2] for x in [1, 2, 3]]))
# [1, 2, 2, 4, 3, 6]
Rust:
// flat_map
let result: Vec<i32> = vec![1, 2, 3]
.iter()
.flat_map(|&x| vec![x, x * 2])
.collect();
// [1, 2, 2, 4, 3, 6]
// Option 的 flat_map
let result: Option<i32> = Some(10)
.flat_map(|x| if x > 5 { Some(x * 2) } else { None });
// Some(20)
Clojure:
(mapcat (fn [x] [x (* x 2)]) [1 2 3])
;; (1 2 2 4 3 6)
5.8 其他常用操作
5.8.1 操作速查表
| 操作 | Haskell | JavaScript | Python | Rust | Clojure |
|---|---|---|---|---|---|
| 映射 | map | .map() | map() | .map() | map |
| 过滤 | filter | .filter() | filter() | .filter() | filter |
| 归约 | foldl' | .reduce() | reduce() | .fold() | reduce |
| 扫描 | scanl | (自定义) | accumulate() | .scan() | reductions |
| 展平 | concat | .flat() | chain() | .flatten() | flatten |
| 扁平映射 | concatMap | .flatMap() | chain() | .flat_map() | mapcat |
| 存在 | any | .some() | any() | .any() | some |
| 全部 | all | .every() | all() | .all() | every? |
| 查找 | find | .find() | next(filter()) | .find() | some |
| 计数 | length | .length | len() | .count() | count |
| 分组 | groupBy | Object.groupBy | groupby() | (itertools) | group-by |
| 去重 | nub | [...new Set()] | set() | .unique() | distinct |
5.8.2 分组与聚合
JavaScript:
const orders = [
{ product: 'Laptop', category: 'Electronics', price: 999 },
{ product: 'Mouse', category: 'Electronics', price: 25 },
{ product: 'Book', category: 'Education', price: 30 },
{ product: 'Pen', category: 'Education', price: 5 },
];
// 分组
const groupByCategory = (orders) =>
orders.reduce((groups, order) => ({
...groups,
[order.category]: [...(groups[order.category] || []), order],
}), {});
// 聚合
const salesByCategory = (orders) =>
Object.entries(groupByCategory(orders)).map(([category, items]) => ({
category,
totalSales: items.reduce((sum, i) => sum + i.price, 0),
count: items.length,
}));
Haskell:
import qualified Data.Map as Map
import Data.List (groupBy, sortOn)
import Data.Function (on)
groupByCategory :: [Order] -> Map.Map String [Order]
groupByCategory = Map.fromListWith (++) . map (\o -> (category o, [o]))
salesByCategory :: [Order] -> [(String, Double, Int)]
salesByCategory orders =
let grouped = Map.toList $ groupByCategory orders
in map (\(cat, items) -> (cat, sum (map price items), length items)) grouped
5.9 业务场景
5.9.1 日志分析管道
// 日志分析管道
const analyzeLogs = (logs) => {
const parseLog = (line) => {
const [timestamp, level, ...msgParts] = line.split(' ');
return { timestamp, level, message: msgParts.join(' ') };
};
return logs
.map(parseLog) // 解析
.filter(log => log.level === 'ERROR') // 只保留错误
.map(log => ({ // 提取关键信息
...log,
module: log.message.match(/\[(\w+)\]/)?.[1],
}))
.reduce((stats, log) => ({ // 统计
...stats,
[log.module]: (stats[log.module] || 0) + 1,
}), {});
};
// 使用
const logs = [
'2026-01-01 ERROR [Auth] Login failed for user alice',
'2026-01-01 INFO [Auth] Login success for user bob',
'2026-01-01 ERROR [DB] Connection timeout',
'2026-01-01 ERROR [Auth] Token expired',
];
analyzeLogs(logs); // { Auth: 2, DB: 1 }
5.9.2 电商报表
from functools import reduce
from itertools import groupby
from operator import itemgetter
def generate_sales_report(orders):
# 管道:筛选已完成订单 → 按类别分组 → 计算统计
completed = [o for o in orders if o['status'] == 'completed']
# 按月份分组
def key_func(o):
return o['date'][:7] # YYYY-MM
completed.sort(key=key_func)
reports = []
for month, group in groupby(completed, key=key_func):
items = list(group)
reports.append({
'month': month,
'total_orders': len(items),
'total_revenue': sum(o['total'] for o in items),
'avg_order_value': sum(o['total'] for o in items) / len(items),
'top_category': max(
reduce(lambda acc, o: {**acc, o['category']: acc.get(o['category'], 0) + o['total']},
items, {}).items(),
key=lambda x: x[1]
)[0],
})
return reports
5.10 注意事项
| 注意事项 | 说明 |
|---|---|
| 惰性 vs 急切 | Haskell 默认惰性,其他语言通常急切 |
| 中间集合开销 | 链式调用产生中间集合,考虑使用 Transducer |
| 短路求值 | any/all/find 可以提前终止 |
| 并行化 | map 天然可并行,reduce 需要结合律 |
| 错误处理 | 大规模数据流中考虑 Either/Result |
5.11 小结
| 要点 | 说明 |
|---|---|
| Map | 转换每个元素,保持长度不变 |
| Filter | 根据谓词筛选元素 |
| Reduce/Fold | 将集合归约为单个值 |
| Scan | 像 reduce 但保留中间结果 |
| Transducer | 可组合的算法转换,无中间集合 |
| 定律 | 恒等律、组合律、分配律保证可预测性 |
扩展阅读
- Transducers: Efficient Data Processing Pipelines in JavaScript — Eric Elliott
- Clojure Transducers — Rich Hickey
- Functors, Applicatives, And Monads In Pictures — Aditya Bhargava
下一章:06 模式匹配 — 强大的数据解构工具