18 个 Laravel 8 数据库查询优化建议

Summer

摈弃世俗浮躁，追求技术精湛

转载自 Laravel 论坛： https:// learnku.com/laravel/t/6 1384

如果应用运行缓慢或存在大量数据库查询，请按照以下性能优化提示来缩短应用的加载时间。

1. 检索大型数据集

本提示主要侧重于提高处理大型数据集时应用的内存使用率。

处理大的集合时，分组检索结果处理，而不是一次性检索处理。

如下展示了从 posts 表检索数据的过程。

$posts = Post::all(); // 使用 eloquent
$posts = DB::table('posts')->get(); // 使用查询构造器
 foreach ($posts as $post){
 // 处理 posts 操作
}

上面的例子会从 posts 表检索所有的记录并处理。如果这个表达到了 100 多万行呢？内存将很快被耗尽。

为了避免在处理大型数据集时出现问题，我们可以检索结果子集并按照下面的方式处理它们。

选项 1: 使用 chunk

// 当使用 eloquent 时
$posts = Post::chunk(100, function($posts){
    foreach ($posts as $post){
     // Process posts
 // 当使用查询构造器时
$posts = DB::table('posts')->chunk(100, function ($posts){
    foreach ($posts as $post){
     // Process posts
});

以上例子从 posts 表中检索 100 条记录对其进行处理，另外再检索 100 条记录进行处理。此迭代将继续，直到处理完所有记录。

这种方法将创建更多的数据库查询，但内存效率会更高。通常, 大型数据集的处理应该再后台进行。因此，可以在后台运行时进行更多查询，以避免在处理大型数据集时耗尽内存。

选项 2: 使用游标

// 使用 eloquent
foreach (Post::cursor() as $post){
   // 处理单个 post
 // 使用 query 构建器
foreach (DB::table('posts')->cursor() as $post){
   // 处理单个 post
}

示例进行单个数据库查询，检索表的所有记录，一个接一个一个处理 Eloquent 模型。这种方式仅查询一次数据库，得到全部 posts 。但使用 php 生成器优化内存使用。

什么情况使用这个呢？

这能够在应用层极大地优化内存使用，由于我们检索表的所有数据，数据库内存占用任然很高。

在数据库内存较多，应用内存较少的时候，建议使用游标。然而，如果你的数据库没有足够的内存，最好使用 chunks 。

选项 3: 使用 chunkById

// 使用 eloquent
$posts = Post::chunkById(100, function($posts){
    foreach ($posts as $post){
     // 处理 posts
 // 使用 query 构造器
$posts = DB::table('posts')->chunkById(100, function ($posts){
    foreach ($posts as $post){
     // 处理 posts
});

chunk 和 chunkById 最大的区别是 chunk 通过 offset 和 limit 检索数据。然而 chunkById 通过 id 字段检索结构。id 字段通常是整型字段，而且它也是自增字段。

chunk 和 chunkById 的查询如下。

chunk

select * from posts offset 0 limit 100
select * from posts offset 101 limit 100

chunkById

select * from posts order by id asc limit 100
select * from posts where id > 100 order by id asc limit 100

通常，查询使用 limit 和 offset 是较慢的，尽量避免使用。本文详细介绍使用 offset 的问题。

chunkById 使用 id 整型字段，通过 where clause 查询，这样会更快。

什么时候使用 chunkById ？

当数据库存在自增 主键 的时候使用。

2. 选择合适的列

通常从数据库检索数据时，会像下面这样做。

$posts = Post::find(1); // 使用 eloquent
$posts = DB::table('posts')->where('id','=',1)->first(); // 使用 query 构建器

上面的代码会得到如下的查询

select * from posts where id = 1 limit 1

select * 表示从表中查出所有列。当需要所有列时，这没有问题。

然而，仅需要指定的列（id，title）时，只需要像下面这样检索那些列。

$posts = Post::select(['id','title'])->find(1); // 使用 eloquent
$posts = DB::table('posts')->where('id','=',1)->select(['id','title'])->first(); // 使用 query 构建器

上面代码得到如下查询

select id,title from posts where id = 1 limit 1

3. 当需要数据库表的一两个列时

这点主要关注对检索结果的处理时间。这不影响实际的查询时间。

如我上面提到的，检索指定的列，可以这样做

$posts = Post::select(['title','slug'])->get(); // 使用 eloquent
$posts = DB::table('posts')->select(['title','slug'])->get(); // 使用 query 构建器

执行上面的代码，它会在幕后执行以下操作。

执行 select title, slug from posts 查询
检索出的每一行对应一个 Post 模型对象（对 PHP 对象）(query 构建器得到标准的 PHP 对象)
为 Post 模型生成 collection
返回 collection

访问数据

foreach ($posts as $post){
    // $post 是 Post 模型或  php 标准对象
    $post->title;
    $post->slug;
}

上面的方式有额外的开销，为每一行创建 Post 模型，并为这些对象创建一个集合。如果的确需要 Post 模型实例而不是数据，这是最正确的做法。

但如果您只需要两个值时，则可以执行以下操作：

$posts = Post::pluck('title', 'slug'); // 使用 eloquent 时
$posts = DB::table('posts')->pluck('title','slug'); // 使用查询构造器时

当上面代码被执行时，它在幕后会执行以下操作。

对数据库执行 select title, slug from posts 查询
创建一个数组，其中会以 title 作为 数组值 ， slug 作为 数组键
返回数组 ( 数组格式： [ slug => title, slug => title ] )

要访问结果，我们可以这么做

foreach ($posts as $slug => $title){
    // $title 是 post 的 title
    // $slug 是 post 的 slug
}

如果您想检索一列，您可以这么做

$posts = Post::pluck('title'); // 使用 eloquent 时
$posts = DB::table('posts')->pluck('title'); // 使用查询构造器时
foreach ($posts as  $title){
    // $title 是 post 的 title
}

上面的方式消除了每一行 Post 对象的创建。这将降低查询结果处理的内存和时间消耗。

建议在新代码中使用上述方式。个人感觉不值得花时间遵循上面的提示重构代码。重构代码，最好是在要处理大的数据集或者是比较闲的时候

4. 使用查询代替 collection 来统计行数

统计表的行数，通常这样做

$posts = Post::all()->count(); // 使用 eloquent
$posts = DB::table('posts')->get()->count(); // 使用查询构造器

这将生成以下查询

select * from posts

上述方法将从表中检索所有行。将它们加载到 collection 对象中并计算结果。当数据表中的行较少时，这可以正常工作。但随着表的增长，内存很快就会耗尽。

与上述方法不同，我们可以直接计算数据库本身的总行数。

$posts = Post::count(); // 使用 eloquent 时
$posts = DB::table('posts')->count(); // 使用查询构造器时

这将生成以下查询

select count(*) from posts

在 sql 中计算行数是一个缓慢的过程，当数据库表中有多行时性能会很差。最好尽量避免计算行数。

5. 通过即时加载关系避免 n + 1查询

这条建议你可能听说过无数次了。所以我会尽可能简短。让我们假设您有以下场景

class PostController extends Controller
    public function index()
        $posts = Post::all();
        return view('posts.index', ['posts' => $posts ]);
// posts/index.blade.php 文件
 @foreach($posts as $post)
        <h3>{{ $post->title }}</h3>
        <p>Author: {{ $post->author->name }}</p>
@endforeach

上面的代码是检索所有的帖子，并在网页上显示帖子标题和作者，假设帖子模型关联 作者 。

执行以上代码将导致运行以下查询。

select * from posts // 假设返回5条数据
select * from authors where id = { post1.author_id }
select * from authors where id = { post2.author_id }
select * from authors where id = { post3.author_id }
select * from authors where id = { post4.author_id }
select * from authors where id = { post5.author_id }

如上，1 条查询来检索帖子，5 条查询来检索帖子的作者（假设有 5 篇帖子）。因此对于每篇帖子，都会进行一个单独的查询来检索它的作者。

所以如果有 N 篇帖子，将会产生 N+1 条查询（1 条查询检索帖子，N 条查询检索每篇帖子的作者）。这常被称作 N+1 查询问题。

避免这个问题，可以像下面这样预加载帖子的作者。

$posts = Post::all(); // Avoid doing this
$posts = Post::with(['author'])->get(); // Do this instead

执行上面的代码得到下面的查询：

select * from posts // Assume this query returned 5 posts
select * from authors where id in( { post1.author_id }, { post2.author_id }, { post3.author_id }, { post4.author_id }, { post5.author_id } )

6. 预加载嵌套关系

从上面的例子，考虑作者归属于一个组，同时需要显示组的名字的情况。因此在 blade 文件中，可以按下面这样做。

@foreach($posts as $post)
        <h3>{{ $post->title }}</h3>
        <p>Author: {{ $post->author->name }}</p>
        <p>Author's Team: {{ $post->author->team->name }}</p>
@endforeach

接着

$posts = Post::with(['author'])->get();

得到下面的查询：

select * from posts // Assume this query returned 5 posts
select * from authors where id in( { post1.author_id }, { post2.author_id }, { post3.author_id }, { post4.author_id }, { post5.author_id } )
select * from teams where id = { author1.team_id }
select * from teams where id = { author2.team_id }
select * from teams where id = { author3.team_id }
select * from teams where id = { author4.team_id }
select * from teams where id = { author5.team_id }

如上，尽管预加载了 authors 关系，仍然产生了大量的查询。这是因为没有预加载 authors 上的 team 关系。

通过下面这样来解决这个它。

$posts = Post::with(['author.team'])->get();

执行得到下面的查询。

select * from posts // Assume this query returned 5 posts
select * from authors where id in( { post1.author_id }, { post2.author_id }, { post3.author_id }, { post4.author_id }, { post5.author_id } )
select * from teams where id in( { author1.team_id }, { author2.team_id }, { author3.team_id }, { author4.team_id }, { author5.team_id } )

通过预加载嵌套关系，可以将查询数从 11 减到 3。

7. 如果仅需要 id 时，别预加载 belongsTo 关系

想象一下，有 posts 和 authors 两张表。帖子表有 author_id 列归属作者表。

为了得到帖子的作者 id，通常这样做

$post = Post::findOrFail(<post id>);
$post->author->id;

执行得到两个查询。

select * from posts where id = <post id> limit 1
select * from authors where id = <post author id> limit 1

然而，可以直接通过下面方式得到作者 id 。

$post = Post::findOrFail(<post id>);
$post->author_id; // 帖子表有存放作者 id 的 author_id 列

什么时候采取上面的方式？
采取上的方式，需要确保帖子关联的作者在作者表始终存在。

8. 避免使用不必要的查询

很多时候，一些数据库查询是不必要的。看看下面的例子。

<?php
 class PostController extends Controller
    public function index()
        $posts = Post::all();
        $private_posts = PrivatePost::all();
        return view('posts.index', ['posts' => $posts, 'private_posts' => $private_posts ]);
}

上面代码是从两张不同的表（ posts ， private_posts ）检索数据，然后传到视图中。视图文件如下。

// posts/index.blade.php
 @if( request()->user()->isAdmin() )
    <h2>Private Posts</h2>
        @foreach($private_posts as $post)
                <h3>{{ $post->title }}</h3>
                <p>Published At: {{ $post->published_at }}</p>
        @endforeach
@endif
 <h2>Posts</h2>
    @foreach($posts as $post)
            <h3>{{ $post->title }}</h3>
            <p>Published At: {{ $post->published_at }}</p>
    @endforeach
</ul>

正如你上面看到的， $private_posts 仅对 管理员 用户可见，其他用户都无法看到这些帖子。

问题是，当我们在做

$posts = Post::all();
$private_posts = PrivatePost::all();

我们进行两次查询。一次从 posts 表获取记录，另一次从 private_posts 表获取记录。

private_posts 表的记录仅 管理员用户 可见。但我们仍在查询以检索所有用户记录，即使它们不可见。

我们可以调整逻辑，避免额外的查询。

$posts = Post::all();
$private_posts = collect();
if( request()->user()->isAdmin() ){
    $private_posts = PrivatePost::all();
}

将逻辑更改为上述内容后，我们对管理员用户进行了两次查询，并对其他用户进行了一次查询。

9. 合并相似的查询

我们有时需要进行查询以同一个表中检索不同类型的行。

$published_posts = Post::where('status','=','published')->get();
$featured_posts = Post::where('status','=','featured')->get();
$scheduled_posts = Post::where('status','=','scheduled')->get();

上述代码正从同一个表检索状态不同的行。代码将进行以下查询。

select * from posts where status = 'published'
select * from posts where status = 'featured'
select * from posts where status = 'scheduled'

如您所见，它正在对同一个表进行三次不同的查询以检索记录。我们可以重构此代码以仅进行一次数据库查询。

$posts =  Post::whereIn('status',['published', 'featured', 'scheduled'])->get();
$published_posts = $posts->where('status','=','published');
$featured_posts = $posts->where('status','=','featured');
$scheduled_posts = $posts->where('status','=','scheduled');
select * from posts where status in ( 'published', 'featured', 'scheduled' )

上面的代码生成一个查询来检索全部特定状态的帖子，通过状态为返回的帖子创建不同的 collections 。三个不同的状态的变量由一个查询生成。

10. 为常查询的列添加索引

如果查询中含有 where 条件作用于 string 类型的 column ，最好给这列添加索引。通过这列的查询将会快很多。

$posts = Post::where('status','=','published')->get();

上面例子，我们对 status 列添加 where 条件来查询。可以通过下面这样的数据库迁移来优化查询。

Schema::table('posts', function (Blueprint $table) {