alistair23-linux

redonkable

Author	SHA1	Message	Date
Andrey Zhizhikin	a6f0cb3ff6	This is the 5.4.91 stable release -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmAHFkkACgkQONu9yGCS aT5DMg//TWHV1loe76Jy6mT7SavddKkO+C6YXdGMYN4vVKJqYzASSqqmkIGYZVOj G5GnILybNjA9aJIqX4vXTXs3YslWZN+rd//GYRyBTE7SwlNI8Lho1ZJq8VqtWo+x jxm+2QNX8wBb9QuCqsnLOVidWVOQ9dcz0GC6/N8gKcAWJ71B2RpwKQxnEXjlJp3f m5cX+Vnm3XnJkdT4mmycV3h4gnOrwhIUGbu8iLbPTmfZf5aZ14eD2Su8gpcunWat 7JY2z1u4jSpkKspG5eVn8wmL1aB5+WhkqU5+rOtHZ+KJZvRY0wTnmIQEBCw0bAW+ 49tIthuJF8wC7oa3hXoXMNG8K112ffeeF2Hm29WFbpFYRinIjGt/MPmg2A1sM+C1 jVQewVOArNLA0lo5m1jun2/c56EEGFKKODzJR7Epphdi+bsY7DSttIfIIzwUqTc5 9wgZG81+l9uP/ohTm7vG8hQcANt0DN+X8wet+HqpuO5Mj5T6150dKW4zQhdOljBH GL/O/31DfIUmLJL50+X6kn47c0noZlwEmZc+buVxdO5bC27cK6awEE3gQeCTgsWj Ok1Sa+3FwwEPnKs8zInYP69U/obvNxBhdxrccrUOViGBxsXKHMPEnXG2bUuiV/7v KnuO9z1Pj3+YAdZTwWygdJcZNdCAwGL4ekQV9N/Pxeg6ejq2E3Q= =TOgX -----END PGP SIGNATURE----- gpgsig -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEdQaENiSDAlGTDEbB7G51OISzHs0FAmAHVv8ACgkQ7G51OISz Hs16CQ//Y47jRspSDzkvfrtOEyTgatnuEmHyAT44CeEMCg1mSp9ESao3fP9e+SPu qzNGeAKdlsI9FxA3T0N4HohuOU6fxnKPcD+cGISgM4/5lwfVXm27g8M4TqkyCoqm MRwNUIF5vdKOW6czEDFdMRBpoNMEvB9gnWTx0QRBX7ngV6zXwwfdh0WIEuFsSe4P LRFMTIrUfCcSxYKT/QRm2k2Jah/h+MdXs19Ig0djcwqHCltGIeYM8hq31DLL8U6Q qSS8l5Ql5IE3rozGTdV93ySCr1apsrHiV6dNN00rVsHKfNQjiq0YJD6jjyCP1d+R e4RAUPREhODv+J27qYKRp3bSovKJvIn6FFcyjgE2ScGDhNfZalw0XiClwmC/QFUX nsDZVtW9dAePJVS2gsWKrUSPwGT7posr3cc5coYeZ/i4p3NjDhqsUl3AerLaYHqq VSKNjwT38yHTelB/6BaDdw8eHkaSuLGOIp2CA6HaGoS4xPiDTqvI+AZROFMWcK/m gV/VuRlfB8vLObjN7zxolsHT+UT3+WM8hR3sDCEyU3NsJvVxeuxTJtL2qb4zcrRR hoDeX5ATOoiKyx56VQR5V/iwr31RXzxFPMRxOc3SePYiZAuJAfy/+SIfquL8sJ4u HP8Bx5jieCbLsCazyk0X2QfA6fCsI+avOrOwY9PAWdsB6XKqx8s= =IdJz -----END PGP SIGNATURE----- Merge tag 'v5.4.91' into 5.4-2.2.x-imx This is the 5.4.91 stable release Signed-off-by: Andrey Zhizhikin <andrey.zhizhikin@leica-geosystems.com>	2021-01-19 22:02:36 +00:00
Jan Kara	bc68af1fdc	bfq: Fix computation of shallow depth [ Upstream commit `6d4d273588` ] BFQ computes number of tags it allows to be allocated for each request type based on tag bitmap. However it uses 1 << bitmap.shift as number of available tags which is wrong. 'shift' is just an internal bitmap value containing logarithm of how many bits bitmap uses in each bitmap word. Thus number of tags allowed for some request types can be far to low. Use proper bitmap.depth which has the number of tags instead. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-01-19 18:26:15 +01:00
Andrey Zhizhikin	873af59ba2	This is the 5.4.90 stable release -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmAENzgACgkQONu9yGCS aT7khA//eTBSPP1vAJIqph0YgQbgCCzvzQTj5enM6F1cCZqVha8s0ZjY4fl9Mkky MTVmQdGEem4MoqypzFgAQPQn8KpoM//sQue+b9evny3wU/cmgry5Hs7H3F1/Y7Yv q27Q5jzRTmvcy4Up21FhpFE58FXCXiO5H58FrtKEuJtoCxk+akyGuF8Z0UH3Rvp/ FTKjAKnfzQ9b3MjBJY16W3EqZnpLB+sFMhimS+QyHAr4biTXgIhM/ZebyKxYOGDw fq9MX5XCSM5Aka9RfWIGl8FF5y1IICkBQ0Il+xI7zsQwONFD9UIMhAcTE2LxybQT YsV/GJ7r/nZWSTcup+vD+tTNceXQoBY2EDGIKeX3rNme8cLWWJeDbTc7KbIkIi35 ctRFeEcUiFMoQEhIXyi7c8DcOU4xjmTUXtigjhcLLzAODuOBriWbIsM81RuLwNGC i/jLYEWhQ+tXozLsmb1/7fL8mvAlZfD3Vwkm4aTSSPul1i52tqBnRZBSut0+KRMa +SOpxytl+H5tFV6Z3bI0lrtJ0xnKdr0oJj367JsxIG1yeOpkqe8CEFWW+14TsjqV R1ETqDTtqi8YTGfIgp4Q3EUe9LdoJwUQFKh1lv0SMKYac6vtz/C+MxziJXHPValE dNK3MocE1zpfMgnZpHP/IwbLOeiWfNl+ZL/wpD73EUr1PvUiRvQ= =4Noe -----END PGP SIGNATURE----- gpgsig -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEdQaENiSDAlGTDEbB7G51OISzHs0FAmAEuNMACgkQ7G51OISz Hs0XMw//UEpM14N+UcLb3yTHYyr/vksat/Rz0fT3R2bBMgmlxmTsUgDv5OZLiXlO mWMbN2k1kDvKl4EuiJfmOK+5FICoFkdBpXugomjLFI0BeJukywdz6l4omDTqHkE7 dK3ntPdpTVFP0zJI7lMOk5DsKW/GprOOeFaw0PEO8255yoo59l8Ayxi3k6rWU3f+ a1laEtpwyitIFBAW4KCOT50/f1vpVnqlhB+2fRpFUwGBceuv42p0fBTZ7vi98Mmu KtKRVHnPKQIROx07oFUlxeeNsyhRacJXXoSvyyfBGc9pf3vs8nCGXhITXSVuQHsh EZh4Ur007Q40sDDaH7FkC0jxqGOc8A/oCAwF9WCVoWy0yx0t/CKpMcgsoBl2H3MC v0/DusVQQod1ohcH0lG6fxM3roKkbbOF2HelRDnz+n7mw9biboKKL3Q+i2CNUmam KutJixqfW2NsfZo87ObSmn2iv7xcDAZWLO4axMgAfygTdoL39v/Ws7ZJ3IEujh/a zKAx9Mb4/TAv1OlmI+3wpsq+efSZ90fUAXnW+ymJazh+sJj/Do+61RcgoyDrG6U+ SGfIa/rbrnG44jywPisAdLMjtg4YBC3ccSh0oqdbO3lrXyT+afL7XLuH8BfT/rYY e33E+xb4qOc4/bfWCFt/rVyZR4PoOo4TJHyoFugQ+N32U7V921c= =UIOp -----END PGP SIGNATURE----- Merge tag 'v5.4.90' into 5.4-2.2.x-imx This is the 5.4.90 stable release Signed-off-by: Andrey Zhizhikin <andrey.zhizhikin@leica-geosystems.com>	2021-01-17 22:23:12 +00:00
Ming Lei	bd0051a5cb	block: fix use-after-free in disk_part_iter_next commit `aebf5db917` upstream. Make sure that bdgrab() is done on the 'block_device' instance before referring to it for avoiding use-after-free. Cc: <stable@vger.kernel.org> Reported-by: syzbot+825f0f9657d4e528046e@syzkaller.appspotmail.com Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-01-17 14:05:38 +01:00
Andrey Zhizhikin	732c2c9b8f	This is the 5.4.87 stable release -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl/1wNYACgkQONu9yGCS aT64cxAAwwt2H911zFagJCVDfLKXJ4da062n0YcJe3saGSg+mdEkSGYEDxjV6jjM jTzK1W5C49sQ9kzIF43YnYgdULwcXJ76G/uqFjFOlmbRzAKAYgs/3KXesa7S4cp+ LT0fiR7uyViOw1zn4yBIeSnax8uRwT4vR1vV++ILC/7vL6hcnOBOPLxGzUKYlvJQ TD8ZQjeTXe5E7IhE+ztuhJQT+hZr1VERTjoktcfmlUps94uITeKdKYoCCZQ/zYIL IS7OgnAw5RNERHa1JUZruaGFvJORTu8wAfVtgD1VgRUZAe2ziWH6aCeDPaWaLzS5 3U7Rc3Fyf0CRYrhe7mI1J864GIEUAe9V34sGQzaU/ap4SWpLvHbu12ePlb+nLNKF MZmGEd0eZuKKDSx9dlcx8hbfVg99YpI5oOeDvfCJpYx/uxNzzJhO5wkkZxweiN9s XTMUhhkTNkhgYdzn4Y8G9++LLAZpwOImSh3NkntoH+mSVlC+jVBbskz6PdywDjQR ROVpW26t5Ee6uDTrjci5cffbfje2y0r9km5/sbRWUz2YGsqYfAI3FtbH5isNUPOm Q6ucTd+xvmApfp9bn+XYLnbTQEGAD6mAgSmO11CIDsUJUvOTD/2cv861kATJqhXm 01rHgohIG604vERppYC3WWFjh0cdevBvwSOpDi1LIdlgbEF6QY0= =q0Fm -----END PGP SIGNATURE----- gpgsig -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEdQaENiSDAlGTDEbB7G51OISzHs0FAl/1128ACgkQ7G51OISz Hs0fQA/9H97uijXtAPPozFsgUXqnFwKRphIxuiqV3eFRMVTGdpQitEH3bMycS1Xv +ZEMLD1O1pFcG1om7i35rf3yCGCaobn1nZpZo1RQq8nYcmnNWw15lNskTFuSLMb7 B3+iZg9zIC/KY02EjHRcltoPWydRrNMGK5I8Vyt7S5idfSbSD40EnxVSPbENnyFA YXHqN3IqoDQHH8+A1TfVxpFui60s1W6NrwNgslIhvW4TIi9hj93KbqT3eaQJxKKc jADJoRDXwXTTTTR4F6Yq2xwfpmm/vSRQUlrAU/WnvsrdI+9AqvoPuqB3AOGEUo6t z8gwF3PYJ+RsWpmSunEKH5ZTkuIVDIFFQu8L3VPS278OfjlHqWMM4EpgSa4lLWRx l4YItszQex/h0hUMJ1Tnf+0pMtq05HtyOiQGrMqGQFIHPVs/fO2aC4BQAIe7qd24 L8hIZwZj3+8z4Ec1bpvnSzKABW09mcvC5guPg6XgF/Ywsp2VohhZlqTirk/tER+v NhgzwzBTv97iU99WKeUf41UPjl6/CHx1EpgqA75Oa+eW4r0Mxsztc1kbkvxFVhEB 4thvAsVG8+csGxd0HK8McXwwnPBVNDVQ6Mxhn6E5TZJSVC+huu/sjK4RRc9BXJD+ eDQGgdW20gj/mynxr14kQ3/fULpZg50KOdBAVkFF7c4zL/Ogym0= =4lm1 -----END PGP SIGNATURE----- Merge tag 'v5.4.87' into 5.4-2.2.x-imx This is the 5.4.87 stable release Signed-off-by: Andrey Zhizhikin <andrey.zhizhikin@leica-geosystems.com>	2021-01-06 15:29:48 +00:00
Bart Van Assche	af07e4dd07	scsi: block: Fix a race in the runtime power management code commit `fa4d0f1992` upstream. With the current implementation the following race can happen: * blk_pre_runtime_suspend() calls blk_freeze_queue_start() and blk_mq_unfreeze_queue(). * blk_queue_enter() calls blk_queue_pm_only() and that function returns true. * blk_queue_enter() calls blk_pm_request_resume() and that function does not call pm_request_resume() because the queue runtime status is RPM_ACTIVE. * blk_pre_runtime_suspend() changes the queue status into RPM_SUSPENDING. Fix this race by changing the queue runtime status into RPM_SUSPENDING before switching q_usage_counter to atomic mode. Link: https://lore.kernel.org/r/20201209052951.16136-2-bvanassche@acm.org Fixes: `986d413b7c` ("blk-mq: Enable support for runtime power management") Cc: Ming Lei <ming.lei@redhat.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: stable <stable@vger.kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Jens Axboe <axboe@kernel.dk> Acked-by: Alan Stern <stern@rowland.harvard.edu> Acked-by: Stanley Chu <stanley.chu@mediatek.com> Co-developed-by: Can Guo <cang@codeaurora.org> Signed-off-by: Can Guo <cang@codeaurora.org> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-01-06 14:48:37 +01:00
Andrey Zhizhikin	8c8c2d4715	This is the 5.4.86 stable release -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl/sW9MACgkQONu9yGCS aT5SwBAAo6dgHqwmPfuf98/8oVeVqTxcmE7GpzpVRH2+yI7Zwk2ez29tAflcM7lT LKtR2WFGAxoCL4DUKXeO7Ubwpue5NoBIsJ8/dAYBesojps3WDaFGL55PvJLWwFJ7 5gPtPzynITaqIC1JCFcrJ7OTp7REiCUZRc1CJXJINWAYL1VbEbH8pH904xfFcivy XnNyL9UiWp1lSB8oF3CRJOaK5M5gY1+wdCFaLVqQn306XDEM8PvZK4G3at/jXWgH jQjArdtC8M8NwjyTwtqW9JAMV+6CD0/HXk0QboTZg6yiaRrtUsfzMqJ1cvhKcQgO kLE3rwdnr3/MxuzSnGWbswflG2WCutoah58g0uN8H0nCiui5mKN6x5K+emgDZIoO ndDnh+/5OE247EK+3CGn/0N8i/fOymrLAnLL4wCXVdlQLMCalnL37ibdfGbAptXi N3GOGZ2iEglvTsEr5w0r86+AzNskm5EqA7mFGFiAyf9viR2xwYk3RrWf2ZyMRos2 2S7mKcZmw7voDu2TIDIhqydToBKxmYI/mUn3mFFme1h3lwzM3zYG1aovVLfd5NkY Gx5E/CA/ut/3n0u/dXJ8SxEitBWkqImp5UdYcElQNxQoXnVU4yKmjf6dDL9Wqh+1 ujCiaCUJd3PY0uXXIb6RWWGs2VaL4xiEnk+ZBm0VI9WEUWksSx0= =jnmv -----END PGP SIGNATURE----- gpgsig -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEdQaENiSDAlGTDEbB7G51OISzHs0FAl/yR4IACgkQ7G51OISz Hs2KfA//e3jp0Ah8krkhVUDaiKJ6cWwt3PyEyEuASV+KB7bf8dW25E98jEriBTBX kEgmy8ZWJqcDvxP7JWshmjSu6y7IfzFhrNMt3Fsd3ZsQz7nUprzokufnUSMcoNeu vXYtJHslunsxnUVOzy/iCAjb0KU7zS5lYxxITAGth1vgEM5QySXcBZx8yWrGxNt+ Hk5Rc4hQlogmh4Mi1t9VoHOafy5smitOwVGtcl8oPiDCkoConXtBvNQgFkncBZWf 0EOXiulRkWeo/KXMVrdVy8J1IzWjQDDM1/JDY/Xx6scLnBBCJ002Yfv/HpL/toAM K5/dYJmRktlsCaKFd14uMTAnEqhjnDyPtxntOa0Z4YExfOGwum/SmOMvQyCGLOJg eF5HejriqRfK2bRBpYO92aXdwmBSuu2cS3AXroHw3tl3VQ+9uzTNzp9iIonsKjSJ 5WQPc+0Ebg5NPtHHkimeUTFcxmYfqOFV2u+wVDi9Lcfm7xzJ8zM7w/IyM6sMpdoZ xKQ0jto8KN0eQKWmih2GL/pde2iihjOPa6RYuRom7LCLMgjJvBMCIQJwUZBg1PU5 0eh7WipmOt1xCfKWi3HX3A+2ibkdT/3CkochK26Md/mrCH4aI4JSacr3lXYTALhi yy37o4ogWgCwDGOmxCPfG2BxZzLywNIBJb3qwT3maDPMkMT0efs= =IdA4 -----END PGP SIGNATURE----- Merge tag 'v5.4.86' into 5.4-2.2.x-imx This is the 5.4.86 stable release Signed-off-by: Andrey Zhizhikin <andrey.zhizhikin@leica-geosystems.com>	2021-01-03 22:38:54 +00:00
Douglas Anderson	f7e6636831	blk-mq: In blk_mq_dispatch_rq_list() "no budget" is a reason to kick [ Upstream commit `ab3cee3762` ] In blk_mq_dispatch_rq_list(), if blk_mq_sched_needs_restart() returns true and the driver returns BLK_STS_RESOURCE then we'll kick the queue. However, there's another case where we might need to kick it. If we were unable to get budget we can be in much the same state as when the driver returns BLK_STS_RESOURCE, so we should treat it the same. It should be noted that even if we add a whole bunch of extra kicking to the queue in other patches this patch is still important. Specifically any kicking that happened before we re-spliced leftover requests into 'hctx->dispatch' wouldn't have found any work, so we really need to make sure we kick ourselves after we've done the splicing. Signed-off-by: Douglas Anderson <dianders@chromium.org> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-12-30 11:50:54 +01:00
Johannes Thumshirn	4f3e3fa623	block: factor out requeue handling from dispatch code [ Upstream commit `c92a41031a` ] Factor out the requeue handling from the dispatch code, this will make subsequent addition of different requeueing schemes easier. Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-12-30 11:50:54 +01:00
Damien Le Moal	9e54ca3d4f	block: Simplify REQ_OP_ZONE_RESET_ALL handling [ Upstream commit `c7a1d926dc` ] There is no need for the function __blkdev_reset_all_zones() as REQ_OP_ZONE_RESET_ALL can be handled directly in blkdev_reset_zones() bio loop with an early break from the loop. This patch removes this function and modifies blkdev_reset_zones(), simplifying the code. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-12-30 11:50:54 +01:00
Andrey Zhizhikin	f3029c73e4	This is the 5.4.76 stable release -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl+qe5YACgkQONu9yGCS aT6bAw//VGKqKOUOva6147u3U98FFBuYMJnZwZIxqvX4PFJnSwqKmsLUoCI8bhJV UJ+lbbBvyNbe2DS1+YkhlHTC15U7dHIWtSM4/FC7rvgTuvjAj4epqDDu5IkOoK4W Pil+zV1fwnwHrcuBbb5Ydk+mS3I/sVjObAQygluQPt1D2xESkyITq/uT9Lal0hRy fbyfUNYrhf4Bdeyfgzr7sEDrorgzQJ+7NBDR5NTzn0j0gph4hhe1z5FWmy8jEPXM kKy39nTrCu5hQhEL7L0G29ZLb0s8mhMM9B7OyKHCALtdc6VqwC3WFZqkwrr/cInQ bDuuBMngRe+n/A5xVMmsnjFyR+znXg82HYQuqrBJ1w3S4pbV+j0dcVJ9PiusyYdR n81HCakatyIq9Oe64yHKIlbxslkfgUjJX+uR4LfNS7iC4ad5fV/BwdCs0z0v2oOH o38e5V/qQFiI442+BR6fPagYEpHxJAlteZTpdUteYUBTpQ97v76K/10fqLdGc07s vevP4T2t3Z1qtswY5VbU2jOkNilgnOlqIw+VSzSXp4N8jcF+TEgtSB/X18eX69oy wQ8+aJzNjWCOFfqbYpS+1X2X/eVzBdBrQ8rk/FMKJ0Edxwm3YpoAqHb6copODzaZ cBwCyhbJbHeYpbzgJkkAJEZKffy6XWmwVqtYoi52HZNB1A5ipIA= =Cjfz -----END PGP SIGNATURE----- gpgsig -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEdQaENiSDAlGTDEbB7G51OISzHs0FAl+qrv0ACgkQ7G51OISz Hs0/nA/+MMPK70/vcX5vjIteU3vGfXCekEQemSdP50Z7yIuQS6PBn57TV/OIeDYQ S+2+rpjXQXWyq+ciqF6q0xbASk+1p9XAa0j8ZVYMOqO4GTzmOhp+kHnrVC9IE+Jn GHUNP9AQfQvjbAG08B2s7UoOf+hi+XjQrtquloDg7NvmxAYIl+/KvtsSRfZtvtyu mk8bgVJb4b787szeDlJUP0iR9mWGJ9NJcFH4c+kScY9mZBlXG3Cp31vSx8+EuZty jafA+I2413J06I9hzzTCDXVHsbsn1dfndjLjr7zOJUhqhQTl99Z9ykpgTOf4qZRp GRi58xqnFNxn2r9OBu95bXKsyjAVDxH6SiKH26cX5ILbBW3Lrh5I66iD0GJf6YGS YmuoWVJlVLeZjAyvvLBgcoMD3mGbxIQ26997861Viea/aCN3tBRB3M5pIdKAihiW J/EgMW8ufSoPFnE6UgqNHTOQRXOGB6wW221y1YzcWJ//6+oIZQkr9RRsQTjagBAn Ynen8MfIrtGgbSsgVBKhG9m9RslfMenaesxaR8N+Hezq+BH9qGEVAJvAs0G4b/38 Nk+pEsOQDZwW+6fFt0Yrcvo+B67vVe9QFJIl/6dx6H+cXEsgPDM6BsEAXU/Tc4Bw 5yYYQZHB9fGooVzlsx2twVILNq9mQpabxUT060Fs+g/T1JpsZz0= =Jdct -----END PGP SIGNATURE----- Merge tag 'v5.4.76' into 5.4-2.2.x-imx This is the 5.4.76 stable release Conflicts: - drivers/tty/serial/fsl_lpuart.c: Fix merge conflict of upstream patches [`86875e1d64`] and [`8febdfb597`], which contradicted with patch [`cde0cb39c0`] from NXP. Signed-off-by: Andrey Zhizhikin <andrey.zhizhikin@leica-geosystems.com>	2020-11-10 15:17:03 +00:00
Gabriel Krisman Bertazi	3c52715cea	blk-cgroup: Pre-allocate tree node on blkg_conf_prep [ Upstream commit `f255c19b3a` ] Similarly to commit `457e490f2b` ("blkcg: allocate struct blkcg_gq outside request queue spinlock"), blkg_create can also trigger occasional -ENOMEM failures at the radix insertion because any allocation inside blkg_create has to be non-blocking, making it more likely to fail. This causes trouble for userspace tools trying to configure io weights who need to deal with this condition. This patch reduces the occurrence of -ENOMEMs on this path by preloading the radix tree element on a GFP_KERNEL context, such that we guarantee the later non-blocking insertion won't fail. A similar solution exists in blkcg_init_queue for the same situation. Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-11-10 12:37:30 +01:00
Gabriel Krisman Bertazi	f77756ea66	blk-cgroup: Fix memleak on error path [ Upstream commit `52abfcbd57` ] If new_blkg allocation raced with blk_policy change and blkg_lookup_check fails, new_blkg is leaked. Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-11-10 12:37:29 +01:00
Andrey Zhizhikin	4c7342a1d4	This is the 5.4.73 stable release -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl+ahE8ACgkQONu9yGCS aT4j1A/9HzkKKoqZ2vXYQ1/uEnUqZech9ly1KxpNTBrSZYAtx3MaWY7tGDEx2BqD y6iw9x4MymhHEbpwLg6YmmdWuMQLNNYJGoyLiPJgWhkE4c7zHadhNz1DcPEI8F7z bSlUJ3Oebr8gzv0FvUmeVXw7Z2EuOqM1zGgTAZfnKY3DkYHbLnrzUJ4AiI8TNeba pPIhjfIJ1TvhF+s5ggf2m8OtSWLZ0doCWCPmCFe2WyERX2WYCzPgsm0yL7L7oXME ZqWpOcClBsiYekBNcZ4kxozhJtArCnv24n9VoXJ/YJIlWKvCA6uC8r527nGN/z08 dfFelj1nDs7/VrCSP4+109EjxLQnSYGgIWP0g0OsC+9wOmrQsYJ1azP1eNjm+NuC hPa8uYVEZxwVyJuEfu4ZB4NMZBlD2qnHoskvBKbyZ8yaVnbvlMp552XMwsmJBpCs 8wArzabrJEz396LUUIYG829D7NBDuRav1Miu+FTzlbn+xZ/Y/S8OmhoG2stWa4wV y5x0M0DWgrqiZ9rMkz9A03UNnCInQVTfIBoMl63xFitW4/0vLsln3+CjzlKm7H46 rD/tKACUoCDjR5DN+JwQzmTdL9zBb4p1cXwWjWb6rON3BkXmO0JVAxzurxI9PfX0 ZWDydZ3HNmrm0d3J12zf3kTX56PfPFAGWUsEc4Ntb5zdWXSQJsE= =fZ3T -----END PGP SIGNATURE----- gpgsig -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEdQaENiSDAlGTDEbB7G51OISzHs0FAl+bPfIACgkQ7G51OISz Hs1LaA/9GOp34DrKBM/z7eN7gcbU4rJUhnggZ3vCXShRs3vtjEJ7wptzJb7lErXX 6JCS/OjZpQpZJcHdBX0Kxovf8LVgvDrsbAhRhQkdFr0dESQQ4UY+vT5me9Y9Ot1F biG2z2HduLAxBgrYB2uA7VeRqlLiAa7ELS5EWB90xjY49w7gy5kK7AFrBRQdme9G r4fY89RD9sJVzo4sxgQwUYXuNJi5OmwbN+wrkwk8HXyL0tAB9SNQJ7A962Gxamao AIXT0CvNpNSkR/4JeqDXbJu54fMZxaF4A7a9mgL42fWe45jQs2zYSNx3vdZskzK8 8z+4FCmShNkGMMLV5k6Ds/lJ8uF1yOkUBJJeiHJxnpZw93xKVWZfOVmgm+WO/mNq POmJVfALFFzwvNllyMX8D++0yhORunzfhzQyKgVAthwmScGQ5TK0cerwAa9VEz1T 40e7AqsNKUxRPnoZYQwM0Y2Vskn6qZ8pOW3rSSQx2YI2lhQHGeUMAugHzYivERkF 8d5hDPaQgfJXmS+S8Xp45zafeMDjoNQFQhZLAptmoF9+NGXdJduPSPJxQ0R/29GT 2LEHNsneGotslpXwluk8x2VlXryf/7okEdR7RLq9kjEyM5BGOc2wcZrD6GXvnAf0 JorbYZPriCaNHrGDdEiRFZmRlKaiR4CbcR3JtYkFJebirmIc1N0= =8w9z -----END PGP SIGNATURE----- Merge tag 'v5.4.73' into 5.4-2.2.x-imx This is the 5.4.73 stable release Conflicts: - arch/arm/boot/dts/imx6sl.dtsi: Commit [`a1767c9019`] in NXP tree is now covered with commit [`5c4c2f437c`] from upstream. - drivers/gpu/drm/mxsfb/mxsfb_drv.c: Resolve merge hunk for patch [`ed8b90d303`] from upstream - drivers/media/i2c/ov5640.c: Patch [`aa4bb8b883`] in NXP tree is now covered by patches [`79ec0578c7`] and [`b2f8546056`] from upstream. Changes from NXP patch [`99aa4c8c18`] are covered in upstream version as well. - drivers/net/ethernet/freescale/fec_main.c: Fix merge fuzz for patch [`9e70485b40`] from upstream. - drivers/usb/cdns3/gadget.c: Keep NXP version of the file, upstream version is not compatible. - drivers/usb/dwc3/core.c: - drivers/usb/dwc3/core.h: Fix merge fuzz of patch [`08045050c6`] together wth NXP patch [`b30e41dc1e`] - sound/soc/fsl/fsl_sai.c: - sound/soc/fsl/fsl_sai.h: Commit [2ea70e51eb72a] in NXP tree is now covered with commit [`1ad7f52fe6`] from upstream. Signed-off-by: Andrey Zhizhikin <andrey.zhizhikin@leica-geosystems.com>	2020-10-29 22:09:24 +00:00
Tetsuo Handa	b3a0ed4110	block: ratelimit handle_bad_sector() message [ Upstream commit `f4ac712e4f` ] syzbot is reporting unkillable task [1], for the caller is failing to handle a corrupted filesystem image which attempts to access beyond the end of the device. While we need to fix the caller, flooding the console with handle_bad_sector() message is unlikely useful. [1] https://syzkaller.appspot.com/bug?id=f1f49fb971d7a3e01bd8ab8cff2ff4572ccf3092 Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-10-29 09:58:01 +01:00
Yang Yang	450d03435c	blk-mq: move cancel of hctx->run_work to the front of blk_exit_queue [ Upstream commit `47ce030b7a` ] blk_exit_queue will free elevator_data, while blk_mq_run_work_fn will access it. Move cancel of hctx->run_work to the front of blk_exit_queue to avoid use-after-free. Fixes: `1b97871b50` ("blk-mq: move cancel of hctx->run_work into blk_mq_hw_sysfs_release") Signed-off-by: Yang Yang <yang.yang@vivo.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-10-29 09:57:34 +01:00
Andrey Zhizhikin	a17bf4eda5	This is the 5.4.70 stable release -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl99WdwACgkQONu9yGCS aT4X0hAAiyB6vsAHKTdVF+lDc2phpblY7ryi/Pe56X3ie+aqZqd/SjVQ4MhZ4lO1 p4JjUjHPByHon2DrMlvE4cVf45LF1qpu2qrGkes/WaLX4OgeAsWPq/i31aks4S7h JebCkX9UeVTLZMZ1beeqfRgsWUX75P8vhafyl5eLC5dJXzzL3G4V9Kz+LUKBuuHU FoEmJHab2olfk1G2wgb9xOlmkeKt1xLBbfW8grv5c4zWQexiXJc+6M8CsQVdErLe eK0XDbBcUMNpcCKFRyJ1NO/Y94Yui0YPQQziSHuSR+E+1PDd9roI+DbgInC82R9t aO0XTnt+9mqxpuYZNHhwa/KHOg/rv/t2Y4GFySOUwaOBGtGRWVRgJfH1AoZu+rdk OWamt8c5Uej8CpPtoXVLNblmnpPKavUd6dox8CyDGN/PPEsk0VoXvENZMjXaAA9Q L0AaKdHnk+JK5HCou5vuw1AhoItB/jbldU7qy7cprZXDS7tEuGXVldRJkU5yVWyI Z4/+ldQOAGSrgvEZz6DxxpQ/RJO1+ai/pJJXXcRu5JghlgnZHrrg1i2/EEMnCNy+ Kd/aXReVwiVX6wozrkOyeyymQaLe8wYeFWrc6vx0Z0L5cw7LVV7Oc0WflRXO0rq8 WN4qmmmL6URFwcVhHlaG8AHlaMfD/yawyb7bRIbxXwt2lHe5G1k= =JxXq -----END PGP SIGNATURE----- Merge tag 'v5.4.70' into 5.4-2.2.x-imx This is the 5.4.70 stable release Signed-off-by: Andrey Zhizhikin <andrey.zhizhikin@leica-geosystems.com>	2020-10-07 08:37:06 +00:00
Konstantin Khlebnikov	2334b2d5a2	block/diskstats: more accurate approximation of io_ticks for slow disks commit `2b8bd42361` upstream. Currently io_ticks is approximated by adding one at each start and end of requests if jiffies counter has changed. This works perfectly for requests shorter than a jiffy or if one of requests starts/ends at each jiffy. If disk executes just one request at a time and they are longer than two jiffies then only first and last jiffies will be accounted. Fix is simple: at the end of request add up into io_ticks jiffies passed since last update rather than just one jiffy. Example: common HDD executes random read 4k requests around 12ms. fio --name=test --filename=/dev/sdb --rw=randread --direct=1 --runtime=30 & iostat -x 10 sdb Note changes of iostat's "%util" 8,43% -> 99,99% before/after patch: Before: Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util sdb 0,00 0,00 82,60 0,00 330,40 0,00 8,00 0,96 12,09 12,09 0,00 1,02 8,43 After: Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util sdb 0,00 0,00 82,50 0,00 330,00 0,00 8,00 1,00 12,10 12,10 0,00 12,12 99,99 Now io_ticks does not loose time between start and end of requests, but for queue-depth > 1 some I/O time between adjacent starts might be lost. For load estimation "%util" is not as useful as average queue length, but it clearly shows how often disk queue is completely empty. Fixes: `5b18b5a737` ("block: delete part_round_stats and switch to less precise counting") Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> From: "Banerjee, Debabrata" <dbanerje@akamai.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-10-07 08:01:29 +02:00
Andrey Zhizhikin	ee7b6ad15b	This is the 5.4.67 stable release -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl9rJlYACgkQONu9yGCS aT6WbRAAga6QVKrO6R4NeKk0fPqKQQoQeTK+phBOFA7jAoX/rIRKyob2Si9BwhBA F77vZ6HIZ7+e/o35JJQYQbffbHYs0ANuS1oHGqe0vgbh+72Viaan6g7lFOhpx8qf y0YS7q+hw4WLZB0gGlBM7nkPXRiis32IrEVabQW+t8hmT2lWyutY8E2yFAU60tvI Tvjm2c2pvHEcHz9MrjEd/jIVxMFnIl42FBTx9bGsbDUCDzBwEvPArS4bNioP7EFJ O+rrGCNvwtiv0DuKzX1UIZzQ88IROmU3ZjsIlgOwla7xJWv4QDgmPfyAyRI48QhH PAZQmSntz+y+MP6B3z3ZBrxc2Fx0kCDtugn2P9+2RVUEpheANJ293vUgYTKN9Roy dHdWHFWNTO9IYpIN0cZjc25db4ULdjerWQrKcCr6ZO8+Ep/0mSzx3lkWjfuUP8Hr L2RD6rAm259OpPq8xhAcJpJvoQLwGxaBHyr4QYUmRgmNVURoqe9Q0MTZuiyGsXhm rtcNky9WvmyyI1lJgXi4A+vmsIThCHEstEMycgTejfJ4itIVA9e1ctJVVomWULCn 9oNStBJpmHw0myDCohbKNjeO1UX/erdF9NaoGto5bnfIhcSae1YQEjRB8zKmzbg1 DpgC1f7IZ7q53vfrDGsAjInOcuEwAn/Y5JMLJOL4mdA9j3XlX2o= =Ot99 -----END PGP SIGNATURE----- Merge tag 'v5.4.67' into 5.4-2.2.x-imx This is the 5.4.67 stable release This updates the kernel present in the NXP release imx_5.4.47_2.2.0 to the latest patchset available from stable korg. Base stable kernel version present in the NXP BSP release is v5.4.47. Following conflicts were recorded and resolved: - arch/arm/mach-imx/pm-imx6.c NXP version has a different PM vectoring scheme, where the IRAM bottom half (8k) is used to store IRAM code and pm_info. Keep this version to be compatible with NXP PM implementation. - arch/arm64/boot/dts/freescale/imx8mm-evk.dts - arch/arm64/boot/dts/freescale/imx8mn-ddr4-evk.dts NXP patches kept to provide proper LDO setup: imx8mm-evk.dts: 975d8ab07267ded741c4c5d7500e524c85ab40d3 imx8mn-ddr4-evk.dts: e8e35fd0e759965809f3dca5979a908a09286198 - drivers/crypto/caam/caamalg.c Keep NXP version, as it already covers the functionality for the upstream patch [`d6bbd4eea2`] - drivers/gpu/drm/imx/dw_hdmi-imx.c - drivers/gpu/drm/imx/imx-ldb.c - drivers/gpu/drm/imx/ipuv3/ipuv3-crtc.c Port changes from upstream commit [`1a27987101`], which extends component lifetime by moving drm structures allocation/free from bind() to probe(). - drivers/gpu/drm/imx/imx-ldb.c Merge patch [`1752ab50e8`] from upstream to disable both LVDS channels when Enoder is disabled - drivers/mmc/host/sdhci-esdhc-imx.c Fix merge fuzz produced by [`6534c897fd`]. - drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c Commit `d1a00c9bb1` from upstream solves the issue with improper error reporting when qdisc type support is absent. Upstream version is merged into NXP implementation. - drivers/net/ethernet/freescale/enetc/enetc.c Commit [`ce06fcb6a6`] from upstream merged, base NXP version kept - drivers/net/ethernet/freescale/enetc/enetc_pf.c Commit [`e8b86b4d87`] from upstream solves the kernel panic in case if probing fails. NXP has a clean-up logic implemented different, where the MDIO remove would be invoked in any failure case. Keep the NXP logic in place. - drivers/thermal/imx_thermal.c Upstream patch [`9025a5589c`] adds missing of_node_put call, NXP version has been adapted to accommodate this patch into the code. - drivers/usb/cdns3/ep0.c Manual merge of commit [`be8df02707`] from upstream to protect cdns3_check_new_setup - drivers/xen/swiotlb-xen.c Port upstream commit `cca58a1669` to NXP tree, manual hunk was resolved during merge. - sound/soc/fsl/fsl_esai.c Commit [`53057bd4ac`] upstream addresses the problem of endless isr in case if exception interrupt is enabled and tasklet is scheduled. Since NXP implementation has tasklet removed with commit [`2bbe95fe6c`], upstream fix does not match the main implementation, hence we keep the NXP version here. - sound/soc/fsl/fsl_sai.c Apply patch [`b8ae2bf5cc`] from upstream, which uses FIFO watermark mask macro. Signed-off-by: Andrey Zhizhikin <andrey.zhizhikin@leica-geosystems.com>	2020-09-26 20:54:42 +00:00
Omar Sandoval	7f07bbf9bc	block: only call sched requeue_request() for scheduled requests [ Upstream commit `e8a8a18505` ] Yang Yang reported the following crash caused by requeueing a flush request in Kyber: [ 2.517297] Unable to handle kernel paging request at virtual address ffffffd8071c0b00 ... [ 2.517468] pc : clear_bit+0x18/0x2c [ 2.517502] lr : sbitmap_queue_clear+0x40/0x228 [ 2.517503] sp : ffffff800832bc60 pstate : 00c00145 ... [ 2.517599] Process ksoftirqd/5 (pid: 51, stack limit = 0xffffff8008328000) [ 2.517602] Call trace: [ 2.517606] clear_bit+0x18/0x2c [ 2.517619] kyber_finish_request+0x74/0x80 [ 2.517627] blk_mq_requeue_request+0x3c/0xc0 [ 2.517637] __scsi_queue_insert+0x11c/0x148 [ 2.517640] scsi_softirq_done+0x114/0x130 [ 2.517643] blk_done_softirq+0x7c/0xb0 [ 2.517651] __do_softirq+0x208/0x3bc [ 2.517657] run_ksoftirqd+0x34/0x60 [ 2.517663] smpboot_thread_fn+0x1c4/0x2c0 [ 2.517667] kthread+0x110/0x120 [ 2.517669] ret_from_fork+0x10/0x18 This happens because Kyber doesn't track flush requests, so kyber_finish_request() reads a garbage domain token. Only call the scheduler's requeue_request() hook if RQF_ELVPRIV is set (like we do for the finish_request() hook in blk_mq_free_request()). Now that we're handling it in blk-mq, also remove the check from BFQ. Reported-by: Yang Yang <yang.yang@vivo.com> Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-09-23 12:40:37 +02:00
Ritesh Harjani	6736317f35	block: Set same_page to false in __bio_try_merge_page if ret is false [ Upstream commit `2cd896a5e8` ] If we hit the UINT_MAX limit of bio->bi_iter.bi_size and so we are anyway not merging this page in this bio, then it make sense to make same_page also as false before returning. Without this patch, we hit below WARNING in iomap. This mostly happens with very large memory system and / or after tweaking vm dirty threshold params to delay writeback of dirty data. WARNING: CPU: 18 PID: 5130 at fs/iomap/buffered-io.c:74 iomap_page_release+0x120/0x150 CPU: 18 PID: 5130 Comm: fio Kdump: loaded Tainted: G W 5.8.0-rc3 #6 Call Trace: __remove_mapping+0x154/0x320 (unreliable) iomap_releasepage+0x80/0x180 try_to_release_page+0x94/0xe0 invalidate_inode_page+0xc8/0x110 invalidate_mapping_pages+0x1dc/0x540 generic_fadvise+0x3c8/0x450 xfs_file_fadvise+0x2c/0xe0 [xfs] vfs_fadvise+0x3c/0x60 ksys_fadvise64_64+0x68/0xe0 sys_fadvise64+0x28/0x40 system_call_exception+0xf8/0x1c0 system_call_common+0xf0/0x278 Fixes: `cc90bc6842` ("block: fix "check bi_size overflow before merge"") Reported-by: Shivaprasad G Bhat <sbhat@linux.ibm.com> Suggested-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com> Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-09-17 13:47:44 +02:00
Tejun Heo	9f4ab0172e	blk-iocost: ioc_pd_free() shouldn't assume irq disabled commit `5aeac7c4b1` upstream. ioc_pd_free() grabs irq-safe ioc->lock without ensuring that irq is disabled when it can be called with irq disabled or enabled. This has a small chance of causing A-A deadlocks and triggers lockdep splats. Use irqsave operations instead. Signed-off-by: Tejun Heo <tj@kernel.org> Fixes: `7caa47151a` ("blkcg: implement blk-iocost") Cc: stable@vger.kernel.org # v5.4+ Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-09-09 19:12:35 +02:00
Jens Axboe	5f5f272281	block: ensure bdi->io_pages is always initialized commit `de1b0ee490` upstream. If a driver leaves the limit settings as the defaults, then we don't initialize bdi->io_pages. This means that file systems may need to work around bdi->io_pages == 0, which is somewhat messy. Initialize the default value just like we do for ->ra_pages. Cc: stable@vger.kernel.org Fixes: `9491ae4aad` ("mm: don't cap request size based on read-ahead setting") Reported-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-09-09 19:12:34 +02:00
Ming Lei	b1a83ee0cb	blk-mq: order adding requests to hctx->dispatch and checking SCHED_RESTART commit `d7d8535f37` upstream. SCHED_RESTART code path is relied to re-run queue for dispatch requests in hctx->dispatch. Meantime the SCHED_RSTART flag is checked when adding requests to hctx->dispatch. memory barriers have to be used for ordering the following two pair of OPs: 1) adding requests to hctx->dispatch and checking SCHED_RESTART in blk_mq_dispatch_rq_list() 2) clearing SCHED_RESTART and checking if there is request in hctx->dispatch in blk_mq_sched_restart(). Without the added memory barrier, either: 1) blk_mq_sched_restart() may miss requests added to hctx->dispatch meantime blk_mq_dispatch_rq_list() observes SCHED_RESTART, and not run queue in dispatch side or 2) blk_mq_dispatch_rq_list still sees SCHED_RESTART, and not run queue in dispatch side, meantime checking if there is request in hctx->dispatch from blk_mq_sched_restart() is missed. IO hang in ltp/fs_fill test is reported by kernel test robot: https://lkml.org/lkml/2020/7/26/77 Turns out it is caused by the above out-of-order OPs. And the IO hang can't be observed any more after applying this patch. Fixes: `bd166ef183` ("blk-mq-sched: add framework for MQ capable IO schedulers") Reported-by: kernel test robot <rong.a.chen@intel.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Bart Van Assche <bvanassche@acm.org> Cc: Christoph Hellwig <hch@lst.de> Cc: David Jeffery <djeffery@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-09-03 11:27:01 +02:00
Keith Busch	f09dbec9c0	block: fix get_max_io_size() commit `e4b469c66f` upstream. A previous commit aligning splits to physical block sizes inadvertently modified one return case such that that it now returns 0 length splits when the number of sectors doesn't exceed the physical offset. This later hits a BUG in bio_split(). Restore the previous working behavior. Fixes: `9cc5169cd4` ("block: Improve physical block alignment of split bios") Reported-by: Eric Deal <eric.deal@wdc.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Cc: Bart Van Assche <bvanassche@acm.org> Cc: stable@vger.kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-09-03 11:27:01 +02:00
Yufen Yu	05c608f630	blkcg: fix memleak for iolatency [ Upstream commit `27029b4b18` ] Normally, blkcg_iolatency_exit() will free related memory in iolatency when cleanup queue. But if blk_throtl_init() return error and queue init fail, blkcg_iolatency_exit() will not do that for us. Then it cause memory leak. Fixes: `d706751215` ("block: introduce blk-iolatency io controller") Signed-off-by: Yufen Yu <yuyufen@huawei.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-09-03 11:26:55 +02:00
Ming Lei	872a2b3182	blk-mq: insert request not through ->queue_rq into sw/scheduler queue [ Upstream commit `db03f88fae` ] `c616cbee97` ("blk-mq: punt failed direct issue to dispatch list") supposed to add request which has been through ->queue_rq() to the hw queue dispatch list, however it adds request running out of budget or driver tag to hw queue too. This way basically bypasses request merge, and causes too many request dispatched to LLD, and system% is unnecessary increased. Fixes this issue by adding request not through ->queue_rq into sw/scheduler queue, and this way is safe because no ->queue_rq is called on this request yet. High %system can be observed on Azure storvsc device, and even soft lock is observed. This patch reduces %system during heavy sequential IO, meantime decreases soft lockup risk. Fixes: `c616cbee97` ("blk-mq: punt failed direct issue to dispatch list") Signed-off-by: Ming Lei <ming.lei@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Bart Van Assche <bvanassche@acm.org> Cc: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-09-03 11:26:54 +02:00
Dmitry Monakhov	1475314530	bfq: fix blkio cgroup leakage v4 [ Upstream commit `2de791ab49` ] Changes from v1: - update commit description with proper ref-accounting justification commit `db37a34c56` ("block, bfq: get a ref to a group when adding it to a service tree") introduce leak forbfq_group and blkcg_gq objects because of get/put imbalance. In fact whole idea of original commit is wrong because bfq_group entity can not dissapear under us because it is referenced by child bfq_queue's entities from here: -> bfq_init_entity() ->bfqg_and_blkg_get(bfqg); ->entity->parent = bfqg->my_entity -> bfq_put_queue(bfqq) FINAL_PUT ->bfqg_and_blkg_put(bfqq_group(bfqq)) ->kmem_cache_free(bfq_pool, bfqq); So parent entity can not disappear while child entity is in tree, and child entities already has proper protection. This patch revert commit `db37a34c56` ("block, bfq: get a ref to a group when adding it to a service tree") bfq_group leak trace caused by bad commit: -> blkg_alloc -> bfq_pq_alloc -> bfqg_get (+1) ->bfq_activate_bfqq ->bfq_activate_requeue_entity -> __bfq_activate_entity ->bfq_get_entity ->bfqg_and_blkg_get (+1) <==== : Note1 ->bfq_del_bfqq_busy ->bfq_deactivate_entity+0x53/0xc0 [bfq] ->__bfq_deactivate_entity+0x1b8/0x210 [bfq] -> bfq_forget_entity(is_in_service = true) entity->on_st_or_in_serv = false <=== :Note2 if (is_in_service) return; ==> do not touch reference -> blkcg_css_offline -> blkcg_destroy_blkgs -> blkg_destroy -> bfq_pd_offline -> __bfq_deactivate_entity if (!entity->on_st_or_in_serv) /* true, because (Note2) return false; -> bfq_pd_free -> bfqg_put() (-1, byt bfqg->ref == 2) because of (Note2) So bfq_group and blkcg_gq will leak forever, see test-case below. ##TESTCASE_BEGIN: #!/bin/bash max_iters=${1:-100} #prep cgroup mounts mount -t tmpfs cgroup_root /sys/fs/cgroup mkdir /sys/fs/cgroup/blkio mount -t cgroup -o blkio none /sys/fs/cgroup/blkio # Prepare blkdev grep blkio /proc/cgroups truncate -s 1M img losetup /dev/loop0 img echo bfq > /sys/block/loop0/queue/scheduler grep blkio /proc/cgroups for ((i=0;i<max_iters;i++)) do mkdir -p /sys/fs/cgroup/blkio/a echo 0 > /sys/fs/cgroup/blkio/a/cgroup.procs dd if=/dev/loop0 bs=4k count=1 of=/dev/null iflag=direct 2> /dev/null echo 0 > /sys/fs/cgroup/blkio/cgroup.procs rmdir /sys/fs/cgroup/blkio/a grep blkio /proc/cgroups done ##TESTCASE_END: Fixes: `db37a34c56` ("block, bfq: get a ref to a group when adding it to a service tree") Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name> Signed-off-by: Dmitry Monakhov <dmtrmonakhov@yandex-team.ru> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-09-03 11:26:54 +02:00
Matthew Wilcox (Oracle)	2295664518	block: Fix page_is_mergeable() for compound pages [ Upstream commit `d81665198b` ] If we pass in an offset which is larger than PAGE_SIZE, then page_is_mergeable() thinks it's not mergeable with the previous bio_vec, leading to a large number of bio_vecs being used. Use a slightly more obvious test that the two pages are compatible with each other. Fixes: `52d52d1c98` ("block: only allow contiguous page structs in a bio_vec") Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-09-03 11:26:54 +02:00
Ming Lei	cc3a73f245	block: respect queue limit of max discard segment [ Upstream commit `943b40c832` ] When queue_max_discard_segments(q) is 1, blk_discard_mergable() will return false for discard request, then normal request merge is applied. However, only queue_max_segments() is checked, so max discard segment limit isn't respected. Check max discard segment limit in the request merge code for fixing the issue. Discard request failure of virtio_blk is fixed. Fixes: `6984046608` ("block: fix the DISCARD request merge") Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-09-03 11:26:54 +02:00
Chengming Zhou	2f53a4b54e	iocost: Fix check condition of iocg abs_vdebt [ Upstream commit `d9012a59db` ] We shouldn't skip iocg when its abs_vdebt is not zero. Fixes: `0b80f9866e` ("iocost: protect iocg->abs_vdebt with iocg->waitq.lock") Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-08-19 08:15:58 +02:00
Ming Lei	d2ccad3c9c	block: fix get_max_segment_size() overflow on 32bit arch commit `4a2f704eb2` upstream. Commit `429120f3df` starts to take account of segment's start dma address when computing max segment size, and data type of 'unsigned long' is used to do that. However, the segment mask may be 0xffffffff, so the figured out segment size may be overflowed in case of zero physical address on 32bit arch. Fix the issue by returning queue_max_segment_size() directly when that happens. Fixes: `429120f3df` ("block: fix splitting segments on boundary masks") Reported-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Guenter Roeck <linux@roeck-us.net> Cc: Christoph Hellwig <hch@lst.de> Tested-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-07-22 09:33:17 +02:00
Ming Lei	310d75f274	block: fix splitting segments on boundary masks commit `429120f3df` upstream. We ran into a problem with a mpt3sas based controller, where we would see random (and hard to reproduce) file corruption). The issue seemed specific to this controller, but wasn't specific to the file system. After a lot of debugging, we find out that it's caused by segments spanning a 4G memory boundary. This shouldn't happen, as the default setting for segment boundary masks is 4G. Turns out there are two issues in get_max_segment_size(): 1) The default segment boundary mask is bypassed 2) The segment start address isn't taken into account when checking segment boundary limit Fix these two issues by removing the bypass of the segment boundary check even if the mask is set to the default value, and taking into account the actual start address of the request when checking if a segment needs splitting. Cc: stable@vger.kernel.org # v5.1+ Reviewed-by: Chris Mason <clm@fb.com> Tested-by: Chris Mason <clm@fb.com> Fixes: `dcebd75592` ("block: use bio_for_each_bvec() to compute multi-page bvec count") Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Dropped const on the page pointer, ppc page_to_phys() doesn't mark the page as const... Signed-off-by: Jens Axboe <axboe@kernel.dk>	2020-07-22 09:33:17 +02:00
Hou Tao	c3adbd37c0	blk-mq-debugfs: update blk_queue_flag_name[] accordingly for new flags [ Upstream commit `bfe373f608` ] Else there may be magic numbers in /sys/kernel/debug/block/*/state. Signed-off-by: Hou Tao <houtao1@huawei.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-07-22 09:32:52 +02:00
Ming Lei	49a7ac29f6	blk-mq: consider non-idle request as "inflight" in blk_mq_rq_inflight() commit `05a4fed69f` upstream. dm-multipath is the only user of blk_mq_queue_inflight(). When dm-multipath calls blk_mq_queue_inflight() to check if it has outstanding IO it can get a false negative. The reason for this is blk_mq_rq_inflight() doesn't consider requests that are no longer MQ_RQ_IN_FLIGHT but that are now MQ_RQ_COMPLETE (->complete isn't called or finished yet) as "inflight". This causes request-based dm-multipath's dm_wait_for_completion() to return before all outstanding dm-multipath requests have actually completed. This breaks DM multipath's suspend functionality because blk-mq requests complete after DM's suspend has finished -- which shouldn't happen. Fix this by considering any request not in the MQ_RQ_IDLE state (so either MQ_RQ_COMPLETE or MQ_RQ_IN_FLIGHT) as "inflight" in blk_mq_rq_inflight(). Fixes: `3c94d83cb3` ("blk-mq: change blk_mq_queue_busy() to blk_mq_queue_inflight()") Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-07-16 08:16:47 +02:00
Chengguang Xu	074ae0cd84	block: release bip in a right way in error path [ Upstream commit `0b8eb629a7` ] Release bip using kfree() in error path when that was allocated by kmalloc(). Signed-off-by: Chengguang Xu <cgxu519@mykernel.net> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-07-16 08:16:36 +02:00
Weiping Zhang	26b0956cb3	block: update hctx map when use multiple maps [ Upstream commit `fe35ec58f0` ] There is an issue when tune the number for read and write queues, if the total queue count was not changed. The hctx->type cannot be updated, since __blk_mq_update_nr_hw_queues will return directly if the total queue count has not been changed. Reproduce: dmesg \| grep "default/read/poll" [ 2.607459] nvme nvme0: 48/0/0 default/read/poll queues cat /sys/kernel/debug/block/nvme0n1/hctx/type \| sort \| uniq -c 48 default tune the write queues to 24: echo 24 > /sys/module/nvme/parameters/write_queues echo 1 > /sys/block/nvme0n1/device/reset_controller dmesg \| grep "default/read/poll" [ 433.547235] nvme nvme0: 24/24/0 default/read/poll queues cat /sys/kernel/debug/block/nvme0n1/hctx/type \| sort \| uniq -c 48 default The driver's hardware queue mapping is not same as block layer. Signed-off-by: Weiping Zhang <zhangweiping@didiglobal.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-06-30 15:37:06 -04:00
yu kuai	b90ca32531	block/bio-integrity: don't free 'buf' if bio_integrity_add_page() failed commit `a75ca93031` upstream. commit `e7bf90e5af` ("block/bio-integrity: fix a memory leak bug") added a kfree() for 'buf' if bio_integrity_add_page() returns '0'. However, the object will be freed in bio_integrity_free() since 'bio->bi_opf' and 'bio->bi_integrity' were set previousy in bio_integrity_alloc(). Fixes: commit `e7bf90e5af` ("block/bio-integrity: fix a memory leak bug") Signed-off-by: yu kuai <yukuai3@huawei.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Bob Liu <bob.liu@oracle.com> Acked-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-06-30 15:36:43 -04:00
Tejun Heo	894d9cd524	iocost: don't let vrate run wild while there's no saturation signal [ Upstream commit `81ca627a93` ] When the QoS targets are met and nothing is being throttled, there's no way to tell how saturated the underlying device is - it could be almost entirely idle, at the cusp of saturation or anywhere inbetween. Given that there's no information, it's best to keep vrate as-is in this state. Before `7cd806a9a9` ("iocost: improve nr_lagging handling"), this was the case - if the device isn't missing QoS targets and nothing is being throttled, busy_level was reset to zero. While fixing nr_lagging handling, `7cd806a9a9` ("iocost: improve nr_lagging handling") broke this. Now, while the device is hitting QoS targets and nothing is being throttled, vrate keeps getting adjusted according to the existing busy_level. This led to vrate keeping climing till it hits max when there's an IO issuer with limited request concurrency if the vrate started low. vrate starts getting adjusted upwards until the issuer can issue IOs w/o being throttled. From then on, QoS targets keeps getting met and nothing on the system needs throttling and vrate keeps getting increased due to the existing busy_level. This patch makes the following changes to the busy_level logic. * Reset busy_level if nr_shortages is zero to avoid the above scenario. * Make non-zero nr_lagging block lowering nr_level but still clear positive busy_level if there's clear non-saturation signal - QoS targets are met and nr_shortages is non-zero. nr_lagging's role is preventing adjusting vrate upwards while there are long-running commands and it shouldn't keep busy_level positive while there's clear non-saturation signal. * Restructure code for clarity and add comments. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Andy Newell <newella@fb.com> Fixes: `7cd806a9a9` ("iocost: improve nr_lagging handling") Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-06-22 09:31:06 +02:00
Weiping Zhang	e7aefaba39	block: reset mapping if failed to update hardware queue count [ Upstream commit `aa880ad690` ] When we increase hardware queue count, blk_mq_update_queue_map will reset the mapping between cpu and hardware queue base on the hardware queue count(set->nr_hw_queues). The mapping cannot be reset if it encounters error in blk_mq_realloc_hw_ctxs, but the fallback flow will continue using it, then blk_mq_map_swqueue will touch a invalid memory, because the mapping points to a wrong hctx. blktest block/030: null_blk: module loaded Increasing nr_hw_queues to 8 fails, fallback to 1 ================================================================== BUG: KASAN: null-ptr-deref in blk_mq_map_swqueue+0x2f2/0x830 Read of size 8 at addr 0000000000000128 by task nproc/8541 CPU: 5 PID: 8541 Comm: nproc Not tainted 5.7.0-rc4-dbg+ #3 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4-rebuilt.opensuse.org 04/01/2014 Call Trace: dump_stack+0xa5/0xe6 __kasan_report.cold+0x65/0xbb kasan_report+0x45/0x60 check_memory_region+0x15e/0x1c0 __kasan_check_read+0x15/0x20 blk_mq_map_swqueue+0x2f2/0x830 __blk_mq_update_nr_hw_queues+0x3df/0x690 blk_mq_update_nr_hw_queues+0x32/0x50 nullb_device_submit_queues_store+0xde/0x160 [null_blk] configfs_write_file+0x1c4/0x250 [configfs] __vfs_write+0x4c/0x90 vfs_write+0x14b/0x2d0 ksys_write+0xdd/0x180 __x64_sys_write+0x47/0x50 do_syscall_64+0x6f/0x310 entry_SYSCALL_64_after_hwframe+0x49/0xb3 Signed-off-by: Weiping Zhang <zhangweiping@didiglobal.com> Tested-by: Bart van Assche <bvanassche@acm.org> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-06-22 09:30:53 +02:00
Ming Lei	201219691a	block: alloc map and request for new hardware queue [ Upstream commit `fd689871bb` ] Alloc new map and request for new hardware queue when increse hardware queue count. Before this patch, it will show a warning for each new hardware queue, but it's not enough, these hctx have no maps and reqeust, when a bio was mapped to these hardware queue, it will trigger kernel panic when get request from these hctx. Test environment: * A NVMe disk supports 128 io queues * 96 cpus in system A corner case can always trigger this panic, there are 96 io queues allocated for HCTX_TYPE_DEFAULT type, the corresponding kernel log: nvme nvme0: 96/0/0 default/read/poll queues. Now we set nvme write queues to 96, then nvme will alloc others(32) queues for read, but blk_mq_update_nr_hw_queues does not alloc map and request for these new added io queues. So when process read nvme disk, it will trigger kernel panic when get request from these hardware context. Reproduce script: nr=$(expr `cat /sys/block/nvme0n1/device/queue_count` - 1) echo $nr > /sys/module/nvme/parameters/write_queues echo 1 > /sys/block/nvme0n1/device/reset_controller dd if=/dev/nvme0n1 of=/dev/null bs=4K count=1 [ 8040.805626] ------------[ cut here ]------------ [ 8040.805627] WARNING: CPU: 82 PID: 12921 at block/blk-mq.c:2578 blk_mq_map_swqueue+0x2b6/0x2c0 [ 8040.805627] Modules linked in: nvme nvme_core nf_conntrack_netlink xt_addrtype br_netfilter overlay xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nft_counter nf_nat_tftp nf_conntrack_tftp nft_masq nf_tables_set nft_fib_inet nft_f ib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack tun bridge nf_defrag_ipv6 nf_defrag_ipv4 stp llc ip6_tables ip_tables nft_compat rfkill ip_set nf_tables nfne tlink sunrpc intel_rapl_msr intel_rapl_common skx_edac nfit libnvdimm x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass ipmi_ssif crct10dif_pclmul crc32_pclmul iTCO_wdt iTCO_vendor_support ghash_clmulni_intel intel_ cstate intel_uncore raid0 joydev intel_rapl_perf ipmi_si pcspkr mei_me ioatdma sg ipmi_devintf mei i2c_i801 dca lpc_ich ipmi_msghandler acpi_power_meter acpi_pad xfs libcrc32c sd_mod ast i2c_algo_bit drm_vram_helper drm_ttm_helper ttm d rm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops [ 8040.805637] ahci drm i40e libahci crc32c_intel libata t10_pi wmi dm_mirror dm_region_hash dm_log dm_mod [last unloaded: nvme_core] [ 8040.805640] CPU: 82 PID: 12921 Comm: kworker/u194:2 Kdump: loaded Tainted: G W 5.6.0-rc5.78317c+ #2 [ 8040.805640] Hardware name: Inspur SA5212M5/YZMB-00882-104, BIOS 4.0.9 08/27/2019 [ 8040.805641] Workqueue: nvme-reset-wq nvme_reset_work [nvme] [ 8040.805642] RIP: 0010:blk_mq_map_swqueue+0x2b6/0x2c0 [ 8040.805643] Code: 00 00 00 00 00 41 83 c5 01 44 39 6d 50 77 b8 5b 5d 41 5c 41 5d 41 5e 41 5f c3 48 8b bb 98 00 00 00 89 d6 e8 8c 81 03 00 eb 83 <0f> 0b e9 52 ff ff ff 0f 1f 00 0f 1f 44 00 00 41 57 48 89 f1 41 56 [ 8040.805643] RSP: 0018:ffffba590d2e7d48 EFLAGS: 00010246 [ 8040.805643] RAX: 0000000000000000 RBX: ffff9f013e1ba800 RCX: 000000000000003d [ 8040.805644] RDX: ffff9f00ffff6000 RSI: 0000000000000003 RDI: ffff9ed200246d90 [ 8040.805644] RBP: ffff9f00f6a79860 R08: 0000000000000000 R09: 000000000000003d [ 8040.805645] R10: 0000000000000001 R11: ffff9f0138c3d000 R12: ffff9f00fb3a9008 [ 8040.805645] R13: 000000000000007f R14: ffffffff96822660 R15: 000000000000005f [ 8040.805645] FS: 0000000000000000(0000) GS:ffff9f013fa80000(0000) knlGS:0000000000000000 [ 8040.805646] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 8040.805646] CR2: 00007f7f397fa6f8 CR3: 0000003d8240a002 CR4: 00000000007606e0 [ 8040.805647] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 8040.805647] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 8040.805647] PKRU: 55555554 [ 8040.805647] Call Trace: [ 8040.805649] blk_mq_update_nr_hw_queues+0x31b/0x390 [ 8040.805650] nvme_reset_work+0xb4b/0xeab [nvme] [ 8040.805651] process_one_work+0x1a7/0x370 [ 8040.805652] worker_thread+0x1c9/0x380 [ 8040.805653] ? max_active_store+0x80/0x80 [ 8040.805655] kthread+0x112/0x130 [ 8040.805656] ? __kthread_parkme+0x70/0x70 [ 8040.805657] ret_from_fork+0x35/0x40 [ 8040.805658] ---[ end trace b5f13b1e73ccb5d3 ]--- [ 8229.365135] BUG: kernel NULL pointer dereference, address: 0000000000000004 [ 8229.365165] #PF: supervisor read access in kernel mode [ 8229.365178] #PF: error_code(0x0000) - not-present page [ 8229.365191] PGD 0 P4D 0 [ 8229.365201] Oops: 0000 [#1] SMP PTI [ 8229.365212] CPU: 77 PID: 13024 Comm: dd Kdump: loaded Tainted: G W 5.6.0-rc5.78317c+ #2 [ 8229.365232] Hardware name: Inspur SA5212M5/YZMB-00882-104, BIOS 4.0.9 08/27/2019 [ 8229.365253] RIP: 0010:blk_mq_get_tag+0x227/0x250 [ 8229.365265] Code: 44 24 04 44 01 e0 48 8b 74 24 38 65 48 33 34 25 28 00 00 00 75 33 48 83 c4 40 5b 5d 41 5c 41 5d 41 5e c3 48 8d 68 10 4c 89 ef <44> 8b 60 04 48 89 ee e8 dd f9 ff ff 83 f8 ff 75 c8 e9 67 fe ff ff [ 8229.365304] RSP: 0018:ffffba590e977970 EFLAGS: 00010246 [ 8229.365317] RAX: 0000000000000000 RBX: ffff9f00f6a79860 RCX: ffffba590e977998 [ 8229.365333] RDX: 0000000000000000 RSI: ffff9f012039b140 RDI: ffffba590e977a38 [ 8229.365349] RBP: 0000000000000010 R08: ffffda58ff94e190 R09: ffffda58ff94e198 [ 8229.365365] R10: 0000000000000011 R11: ffff9f00f6a79860 R12: 0000000000000000 [ 8229.365381] R13: ffffba590e977a38 R14: ffff9f012039b140 R15: 0000000000000001 [ 8229.365397] FS: 00007f481c230580(0000) GS:ffff9f013f940000(0000) knlGS:0000000000000000 [ 8229.365415] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 8229.365428] CR2: 0000000000000004 CR3: 0000005f35e26004 CR4: 00000000007606e0 [ 8229.365444] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 8229.365460] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 8229.365476] PKRU: 55555554 [ 8229.365484] Call Trace: [ 8229.365498] ? finish_wait+0x80/0x80 [ 8229.365512] blk_mq_get_request+0xcb/0x3f0 [ 8229.365525] blk_mq_make_request+0x143/0x5d0 [ 8229.365538] generic_make_request+0xcf/0x310 [ 8229.365553] ? scan_shadow_nodes+0x30/0x30 [ 8229.365564] submit_bio+0x3c/0x150 [ 8229.365576] mpage_readpages+0x163/0x1a0 [ 8229.365588] ? blkdev_direct_IO+0x490/0x490 [ 8229.365601] read_pages+0x6b/0x190 [ 8229.365612] __do_page_cache_readahead+0x1c1/0x1e0 [ 8229.365626] ondemand_readahead+0x182/0x2f0 [ 8229.365639] generic_file_buffered_read+0x590/0xab0 [ 8229.365655] new_sync_read+0x12a/0x1c0 [ 8229.365666] vfs_read+0x8a/0x140 [ 8229.365676] ksys_read+0x59/0xd0 [ 8229.365688] do_syscall_64+0x55/0x1d0 [ 8229.365700] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Weiping Zhang <zhangweiping@didiglobal.com> Tested-by: Weiping Zhang <zhangweiping@didiglobal.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-06-22 09:30:53 +02:00
Jason Liu	5691e22711	Merge tag 'v5.4.47' into imx_5.4.y * tag 'v5.4.47': (2193 commits) Linux 5.4.47 KVM: arm64: Save the host's PtrAuth keys in non-preemptible context KVM: arm64: Synchronize sysreg state on injecting an AArch32 exception ... Conflicts: arch/arm/boot/dts/imx6qdl.dtsi arch/arm/mach-imx/Kconfig arch/arm/mach-imx/common.h arch/arm/mach-imx/suspend-imx6.S arch/arm64/boot/dts/freescale/imx8qxp-mek.dts arch/powerpc/include/asm/cacheflush.h drivers/cpufreq/imx6q-cpufreq.c drivers/dma/imx-sdma.c drivers/edac/synopsys_edac.c drivers/firmware/imx/imx-scu.c drivers/net/ethernet/freescale/fec.h drivers/net/ethernet/freescale/fec_main.c drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c drivers/net/phy/phy_device.c drivers/perf/fsl_imx8_ddr_perf.c drivers/usb/cdns3/gadget.c drivers/usb/dwc3/gadget.c include/uapi/linux/dma-buf.h Signed-off-by: Jason Liu <jason.hui.liu@nxp.com>	2020-06-19 17:32:49 +08:00
Jens Axboe	bba91cdba6	Revert "block: end bio with BLK_STS_AGAIN in case of non-mq devs and REQ_NOWAIT" [ Upstream commit `b0beb28097` ] This reverts commit `c58c1f8343`. io_uring does do the right thing for this case, and we're still returning -EAGAIN to userspace for the cases we don't support. Revert this change to avoid doing endless spins of resubmits. Cc: stable@vger.kernel.org # v5.6 Reported-by: Bijan Mottahedeh <bijan.mottahedeh@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-06-03 08:21:27 +02:00
Tejun Heo	34ca080088	iocost: protect iocg->abs_vdebt with iocg->waitq.lock commit `0b80f9866e` upstream. abs_vdebt is an atomic_64 which tracks how much over budget a given cgroup is and controls the activation of use_delay mechanism. Once a cgroup goes over budget from forced IOs, it has to pay it back with its future budget. The progress guarantee on debt paying comes from the iocg being active - active iocgs are processed by the periodic timer, which ensures that as time passes the debts dissipate and the iocg returns to normal operation. However, both iocg activation and vdebt handling are asynchronous and a sequence like the following may happen. 1. The iocg is in the process of being deactivated by the periodic timer. 2. A bio enters ioc_rqos_throttle(), calls iocg_activate() which returns without anything because it still sees that the iocg is already active. 3. The iocg is deactivated. 4. The bio from #2 is over budget but needs to be forced. It increases abs_vdebt and goes over the threshold and enables use_delay. 5. IO control is enabled for the iocg's subtree and now IOs are attributed to the descendant cgroups and the iocg itself no longer issues IOs. This leaves the iocg with stuck abs_vdebt - it has debt but inactive and no further IOs which can activate it. This can end up unduly punishing all the descendants cgroups. The usual throttling path has the same issue - the iocg must be active while throttled to ensure that future event will wake it up - and solves the problem by synchronizing the throttling path with a spinlock. abs_vdebt handling is another form of overage handling and shares a lot of characteristics including the fact that it isn't in the hottest path. This patch fixes the above and other possible races by strictly synchronizing abs_vdebt and use_delay handling with iocg->waitq.lock. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Vlad Dmitriev <vvd@fb.com> Cc: stable@vger.kernel.org # v5.4+ Fixes: `e1518f63f2` ("blk-iocost: Don't let merges push vtime into the future") Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-05-14 07:58:27 +02:00
John Garry	c7b6c51298	blk-mq: Put driver tag in blk_mq_dispatch_rq_list() when no budget [ Upstream commit `5fe56de799` ] If in blk_mq_dispatch_rq_list() we find no budget, then we break of the dispatch loop, but the request may keep the driver tag, evaulated in 'nxt' in the previous loop iteration. Fix by putting the driver tag for that request. Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-05-02 08:48:59 +02:00
Waiman Long	9c5c94c501	blk-iocost: Fix error on iocost_ioc_vrate_adj commit `d6c8e949a3` upstream. Systemtap 4.2 is unable to correctly interpret the "u32 (missed_ppm)[2]" argument of the iocost_ioc_vrate_adj trace entry defined in include/trace/events/iocost.h leading to the following error: /tmp/stapAcz0G0/stap_c89c58b83cea1724e26395efa9ed4939_6321_aux_6.c:78:8: error: expected ‘;’, ‘,’ or ‘)’ before ‘’ token , u32[]* __tracepoint_arg_missed_ppm That argument type is indeed rather complex and hard to read. Looking at block/blk-iocost.c. It is just a 2-entry u32 array. By simplifying the argument to a simple "u32 *missed_ppm" and adjusting the trace entry accordingly, the compilation error was gone. Fixes: `7caa47151a` ("blkcg: implement blk-iocost") Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-05-02 08:48:53 +02:00
Paolo Valente	a362482b23	block, bfq: invoke flush_idle_tree after reparent_active_queues in pd_offline commit `4d38a87fbb` upstream. In bfq_pd_offline(), the function bfq_flush_idle_tree() is invoked to flush the rb tree that contains all idle entities belonging to the pd (cgroup) being destroyed. In particular, bfq_flush_idle_tree() is invoked before bfq_reparent_active_queues(). Yet the latter may happen to add some entities to the idle tree. It happens if, in some of the calls to bfq_bfqq_move() performed by bfq_reparent_active_queues(), the queue to move is empty and gets expired. This commit simply reverses the invocation order between bfq_flush_idle_tree() and bfq_reparent_active_queues(). Tested-by: cki-project@redhat.com Signed-off-by: Paolo Valente <paolo.valente@linaro.org> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-23 10:36:26 +02:00
Paolo Valente	839b7cd1d8	block, bfq: make reparent_leaf_entity actually work only on leaf entities commit `576682fa52` upstream. bfq_reparent_leaf_entity() reparents the input leaf entity (a leaf entity represents just a bfq_queue in an entity tree). Yet, the input entity is guaranteed to always be a leaf entity only in two-level entity trees. In this respect, because of the error fixed by commit `14afc59361` ("block, bfq: fix overwrite of bfq_group pointer in bfq_find_set_group()"), all (wrongly collapsed) entity trees happened to actually have only two levels. After the latter commit, this does not hold any longer. This commit fixes this problem by modifying bfq_reparent_leaf_entity(), so that it searches an active leaf entity down the path that stems from the input entity. Such a leaf entity is guaranteed to exist when bfq_reparent_leaf_entity() is invoked. Tested-by: cki-project@redhat.com Signed-off-by: Paolo Valente <paolo.valente@linaro.org> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-23 10:36:26 +02:00
Paolo Valente	ad749ca022	block, bfq: turn put_queue into release_process_ref in __bfq_bic_change_cgroup commit `c899773665` upstream. A bfq_put_queue() may be invoked in __bfq_bic_change_cgroup(). The goal of this put is to release a process reference to a bfq_queue. But process-reference releases may trigger also some extra operation, and, to this goal, are handled through bfq_release_process_ref(). So, turn the invocation of bfq_put_queue() into an invocation of bfq_release_process_ref(). Tested-by: cki-project@redhat.com Signed-off-by: Paolo Valente <paolo.valente@linaro.org> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-23 10:36:26 +02:00
Zhiqiang Liu	b37de1b1e8	block, bfq: fix use-after-free in bfq_idle_slice_timer_body [ Upstream commit `2f95fa5c95` ] In bfq_idle_slice_timer func, bfqq = bfqd->in_service_queue is not in bfqd-lock critical section. The bfqq, which is not equal to NULL in bfq_idle_slice_timer, may be freed after passing to bfq_idle_slice_timer_body. So we will access the freed memory. In addition, considering the bfqq may be in race, we should firstly check whether bfqq is in service before doing something on it in bfq_idle_slice_timer_body func. If the bfqq in race is not in service, it means the bfqq has been expired through __bfq_bfqq_expire func, and wait_request flags has been cleared in __bfq_bfqd_reset_in_service func. So we do not need to re-clear the wait_request of bfqq which is not in service. KASAN log is given as follows: [13058.354613] ================================================================== [13058.354640] BUG: KASAN: use-after-free in bfq_idle_slice_timer+0xac/0x290 [13058.354644] Read of size 8 at addr ffffa02cf3e63f78 by task fork13/19767 [13058.354646] [13058.354655] CPU: 96 PID: 19767 Comm: fork13 [13058.354661] Call trace: [13058.354667] dump_backtrace+0x0/0x310 [13058.354672] show_stack+0x28/0x38 [13058.354681] dump_stack+0xd8/0x108 [13058.354687] print_address_description+0x68/0x2d0 [13058.354690] kasan_report+0x124/0x2e0 [13058.354697] __asan_load8+0x88/0xb0 [13058.354702] bfq_idle_slice_timer+0xac/0x290 [13058.354707] __hrtimer_run_queues+0x298/0x8b8 [13058.354710] hrtimer_interrupt+0x1b8/0x678 [13058.354716] arch_timer_handler_phys+0x4c/0x78 [13058.354722] handle_percpu_devid_irq+0xf0/0x558 [13058.354731] generic_handle_irq+0x50/0x70 [13058.354735] __handle_domain_irq+0x94/0x110 [13058.354739] gic_handle_irq+0x8c/0x1b0 [13058.354742] el1_irq+0xb8/0x140 [13058.354748] do_wp_page+0x260/0xe28 [13058.354752] __handle_mm_fault+0x8ec/0x9b0 [13058.354756] handle_mm_fault+0x280/0x460 [13058.354762] do_page_fault+0x3ec/0x890 [13058.354765] do_mem_abort+0xc0/0x1b0 [13058.354768] el0_da+0x24/0x28 [13058.354770] [13058.354773] Allocated by task 19731: [13058.354780] kasan_kmalloc+0xe0/0x190 [13058.354784] kasan_slab_alloc+0x14/0x20 [13058.354788] kmem_cache_alloc_node+0x130/0x440 [13058.354793] bfq_get_queue+0x138/0x858 [13058.354797] bfq_get_bfqq_handle_split+0xd4/0x328 [13058.354801] bfq_init_rq+0x1f4/0x1180 [13058.354806] bfq_insert_requests+0x264/0x1c98 [13058.354811] blk_mq_sched_insert_requests+0x1c4/0x488 [13058.354818] blk_mq_flush_plug_list+0x2d4/0x6e0 [13058.354826] blk_flush_plug_list+0x230/0x548 [13058.354830] blk_finish_plug+0x60/0x80 [13058.354838] read_pages+0xec/0x2c0 [13058.354842] __do_page_cache_readahead+0x374/0x438 [13058.354846] ondemand_readahead+0x24c/0x6b0 [13058.354851] page_cache_sync_readahead+0x17c/0x2f8 [13058.354858] generic_file_buffered_read+0x588/0xc58 [13058.354862] generic_file_read_iter+0x1b4/0x278 [13058.354965] ext4_file_read_iter+0xa8/0x1d8 [ext4] [13058.354972] __vfs_read+0x238/0x320 [13058.354976] vfs_read+0xbc/0x1c0 [13058.354980] ksys_read+0xdc/0x1b8 [13058.354984] __arm64_sys_read+0x50/0x60 [13058.354990] el0_svc_common+0xb4/0x1d8 [13058.354994] el0_svc_handler+0x50/0xa8 [13058.354998] el0_svc+0x8/0xc [13058.354999] [13058.355001] Freed by task 19731: [13058.355007] __kasan_slab_free+0x120/0x228 [13058.355010] kasan_slab_free+0x10/0x18 [13058.355014] kmem_cache_free+0x288/0x3f0 [13058.355018] bfq_put_queue+0x134/0x208 [13058.355022] bfq_exit_icq_bfqq+0x164/0x348 [13058.355026] bfq_exit_icq+0x28/0x40 [13058.355030] ioc_exit_icq+0xa0/0x150 [13058.355035] put_io_context_active+0x250/0x438 [13058.355038] exit_io_context+0xd0/0x138 [13058.355045] do_exit+0x734/0xc58 [13058.355050] do_group_exit+0x78/0x220 [13058.355054] __wake_up_parent+0x0/0x50 [13058.355058] el0_svc_common+0xb4/0x1d8 [13058.355062] el0_svc_handler+0x50/0xa8 [13058.355066] el0_svc+0x8/0xc [13058.355067] [13058.355071] The buggy address belongs to the object at ffffa02cf3e63e70#012 which belongs to the cache bfq_queue of size 464 [13058.355075] The buggy address is located 264 bytes inside of#012 464-byte region [ffffa02cf3e63e70, ffffa02cf3e64040) [13058.355077] The buggy address belongs to the page: [13058.355083] page:ffff7e80b3cf9800 count:1 mapcount:0 mapping:ffff802db5c90780 index:0xffffa02cf3e606f0 compound_mapcount: 0 [13058.366175] flags: 0x2ffffe0000008100(slab\|head) [13058.370781] raw: 2ffffe0000008100 ffff7e80b53b1408 ffffa02d730c1c90 ffff802db5c90780 [13058.370787] raw: ffffa02cf3e606f0 0000000000370023 00000001ffffffff 0000000000000000 [13058.370789] page dumped because: kasan: bad access detected [13058.370791] [13058.370792] Memory state around the buggy address: [13058.370797] ffffa02cf3e63e00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fb fb [13058.370801] ffffa02cf3e63e80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [13058.370805] >ffffa02cf3e63f00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [13058.370808] ^ [13058.370811] ffffa02cf3e63f80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [13058.370815] ffffa02cf3e64000: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc [13058.370817] ================================================================== [13058.370820] Disabling lock debugging due to kernel taint Here, we directly pass the bfqd to bfq_idle_slice_timer_body func. -- V2->V3: rewrite the comment as suggested by Paolo Valente V1->V2: add one comment, and add Fixes and Reported-by tag. Fixes: `aee69d78d` ("block, bfq: introduce the BFQ-v0 I/O scheduler as an extra scheduler") Acked-by: Paolo Valente <paolo.valente@linaro.org> Reported-by: Wang Wang <wangwang2@huawei.com> Signed-off-by: Zhiqiang Liu <liuzhiqiang26@huawei.com> Signed-off-by: Feilong Lin <linfeilong@huawei.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-04-17 10:50:05 +02:00

1 2 3 4 5 ...

4713 Commits (redonkable)