* Redfish requests for dbus-sensors data are needlessly slow - analysis
@ 2021-07-29  9:46 Paul Fertser
  2021-07-29 14:53 ` Ed Tanous
  0 siblings, 1 reply; 9+ messages in thread
From: Paul Fertser @ 2021-07-29  9:46 UTC (permalink / raw)
To: openbmc, Ed Tanous, Jason Bills, Gunnar Mills, Jae Hyun Yoo,
	Vernon Mauery, Zhikui Ren
Cc: Timofei Nikitin, Andrei Tutolmin

Hello,

Short version:

Our operations team complains about very high (seconds) latency of
requesting sensor data used to monitor health of the servers via
Redfish. Boards are OCP Tioga Pass with AST2500 BMC. Software
involved: OpenBMC bmcweb, dbus-sensors, ObjectMapper, EntityManager.

In this mail I provide details about the testing methodology, the
complete call chain (including RPC) involved in handling Redfish
requests, and a quick-and-dirty patch that caches some D-Bus replies
in bmcweb and cuts response times by more than a factor of four. It
also significantly reduces CPU overhead.

My impression so far is that the current way is rather wasteful and
can be considerably improved.

Our question is basically whether what we see is normal, whether we
should change the way servers are monitored, and whether bmcweb
should be better optimised; if so, how exactly?


Longer version:

Our operations team is normally using Zabbix for monitoring, and they
developed a set of scripts and templates to handle OpenBMC machines:
https://github.com/Gagar-in/redfish-bmc-lld .

For easier reproduction and analysis I instead use a plain curl
command to request data for a single specific sensor. The following
text assumes we're interested in getting the output current of the
first DRAM VR (handled by the pxe1610 kernel driver and the PSUSensor
OpenBMC daemon, which caches the values periodically read from hwmon
nodes).
To make sure I'm measuring just the time it takes to answer the GET
request itself (and not counting any TLS session establishment
overhead, as curl is reusing an open connection) I'm running this
command:

$ curl -w '\n\nTime spent: %{time_total}\n\n\n' -k $(for i in $(seq 5); do echo https://root:0penBmc@$BMCIP/redfish/v1/Chassis/TiogaPass_Baseboard/Sensors/CPU0_VDDQ_A_IOUT; done)

I typically get over 2 seconds for each request made. With the patch
at the end of the mail the same test consistently shows times below
half a second.

The major part of the delay comes from numerous (and probably
suboptimal) D-Bus calls, plus the overhead to process the responses.

Here follows D-Bus profiling data typical for each Redfish sensors
request made. "mc" is the method call performed by "bmcweb", "mr" is
the response. I've added the second column (by processing dbus-monitor
--profile output through awk -vOFS='\t' '{ d=$2-prev; prev=$2;
$2=d"\t"$2; print $0 }' ) to show the difference between the current
timestamp and the previous one so it's visible how much time it takes
for the daemons to answer each request and how much time bmcweb needs
to process the result.
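For readers who want to sanity-check the delta column, the same computation the awk one-liner performs can be sketched in Python (the sample records below are shortened from the trace; the first delta is meaningless, exactly as with the awk version where prev starts out empty):

```python
# Mirror of the awk post-processing: for every dbus-monitor --profile
# record, insert the time elapsed since the previous record as a new
# second column.
sample = [
    "mc\t1625473150.291539\t1665",
    "mr\t1625473150.337938\t469",
    "mc\t1625473150.343503\t1666",
]

def add_deltas(lines):
    out, prev = [], 0.0
    for line in lines:
        fields = line.split("\t")
        ts = float(fields[1])
        # Insert the delta before the original timestamp, like $2=d"\t"$2
        fields.insert(1, f"{ts - prev:.6f}")
        prev = ts
        out.append("\t".join(fields))
    return out

for line in add_deltas(sample):
    print(line)
```

The second record's delta comes out as roughly 0.046 s, matching the 0.0463991 in the trace below.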
mc ---------  1625473150.291539 1665 :1.79 xyz.openbmc_project.User.Manager /xyz/openbmc_project/user xyz.openbmc_project.User.Manager GetUserInfo
mr 0.0463991  1625473150.337938 469 :1.36 :1.79 1665
mc 0.00556493 1625473150.343503 1666 :1.79 xyz.openbmc_project.ObjectMapper /xyz/openbmc_project/object_mapper xyz.openbmc_project.ObjectMapper GetSubTree
err 0.0143089 1625473150.357812 470 :1.36 :1.79 1665
mr 0.0478861  1625473150.405698 5891 :1.49 :1.79 1666
mc 0.0611379  1625473150.466836 1667 :1.79 xyz.openbmc_project.ObjectMapper /xyz/openbmc_project/object_mapper xyz.openbmc_project.ObjectMapper GetSubTree
mr 0.0716419  1625473150.538478 5893 :1.49 :1.79 1667
mc 0.0977321  1625473150.636210 1668 :1.79 xyz.openbmc_project.ObjectMapper /xyz/openbmc_project/object_mapper xyz.openbmc_project.ObjectMapper GetSubTree
mr 0.0268672  1625473150.663077 5895 :1.49 :1.79 1668
mc 0.0232749  1625473150.686352 1669 :1.79 xyz.openbmc_project.ObjectMapper / org.freedesktop.DBus.ObjectManager GetManagedObjects
mr 0.156963   1625473150.843315 5896 :1.49 :1.79 1669
mc 0.428404   1625473151.271719 1670 :1.79 xyz.openbmc_project.ObjectMapper /xyz/openbmc_project/object_mapper xyz.openbmc_project.ObjectMapper GetSubTree
mr 0.030153   1625473151.301872 5898 :1.49 :1.79 1670
mc 0.123439   1625473151.425311 1671 :1.79 xyz.openbmc_project.EntityManager / org.freedesktop.DBus.ObjectManager GetManagedObjects
mr 0.284928   1625473151.710239 1408 :1.63 :1.79 1671
mc 0.108877   1625473151.819116 1672 :1.79 xyz.openbmc_project.ObjectMapper /xyz/openbmc_project/object_mapper xyz.openbmc_project.ObjectMapper GetSubTree
mr 0.0525708  1625473151.871687 5900 :1.49 :1.79 1672
mc 0.0175102  1625473151.889197 1673 :1.79 xyz.openbmc_project.PSUSensor / org.freedesktop.DBus.ObjectManager GetManagedObjects
mr 0.119283   1625473152.008480 124202 :1.104 :1.79 1673


To provide more insight into what bmcweb is doing and why, I made the
following call chain description by carefully reading
redfish-core/lib/sensors.hpp source code:

0. requestRoutesSensor() is called to process the GET request.

For further processing a D-Bus call is made. For experimenting on a
running system I'm providing copy-pasteable "busctl" invocations for
all the calls involved.

# busctl --json=pretty call xyz.openbmc_project.ObjectMapper /xyz/openbmc_project/object_mapper xyz.openbmc_project.ObjectMapper GetSubTree sias /xyz/openbmc_project/sensors 2 1 xyz.openbmc_project.Sensor.Value

The returned array is searched for the requested sensor name, and a
flat_set is created with a single string containing the D-Bus sensor
path (in this example
"/xyz/openbmc_project/sensors/current/CPU0_VDDQ_A_IOUT"). The flat_set
is passed to processSensorList().


1. processSensorList() calls getConnections() which calls
getObjectsWithConnection() (the latter can also be called from
setSensorsOverride()).

It does this call (exactly the same as the previous one!):

# busctl --json=pretty call xyz.openbmc_project.ObjectMapper /xyz/openbmc_project/object_mapper xyz.openbmc_project.ObjectMapper GetSubTree sias /xyz/openbmc_project/sensors 2 1 xyz.openbmc_project.Sensor.Value

For each returned subtree it determines if it's one of the sensors we
requested. In this case the bus names are added to a "connections"
flat_set (in this example it's just "xyz.openbmc_project.PSUSensor")
and a set of tuples "objectsWithConnection" is created (in this
example it's
("/xyz/openbmc_project/sensors/current/CPU0_VDDQ_A_IOUT", "xyz.openbmc_project.PSUSensor")).
"connections" is passed to the callback provided by
processSensorList(), "objectsWithConnection" is ignored.


2. getConnectionCb() defined in processSensorList() calls getObjectManagerPaths().
The following call is made:

# busctl --json=pretty call xyz.openbmc_project.ObjectMapper /xyz/openbmc_project/object_mapper xyz.openbmc_project.ObjectMapper GetSubTree sias / 0 1 org.freedesktop.DBus.ObjectManager

The result is converted to an "objectMgrPaths" associative array, with
connection used as a key and path as a value (in this example the
essential elements would be
objectMgrPaths["xyz.openbmc_project.PSUSensor"]="/" and
objectMgrPaths["xyz.openbmc_project.ObjectMapper"]="/"). This map is
passed to getObjectManagerPathsCb() defined inside processSensorList().


3. getInventoryItems() calls getInventoryItemAssociations() which uses
the objectMgrPaths map to look for the path for
"xyz.openbmc_project.ObjectMapper". If it's not found, "/" is assumed.

The D-Bus call is made:

# busctl --json=pretty call xyz.openbmc_project.ObjectMapper / org.freedesktop.DBus.ObjectManager GetManagedObjects

For each sensor path we're interested in we try to find a
corresponding inventory object (in this case
"/xyz/openbmc_project/sensors/current/CPU0_VDDQ_A_IOUT/inventory") to
get the list of endpoints for its associations. The first endpoint
path is used in a call to addInventoryItem() (in this example it's
"/xyz/openbmc_project/inventory/system/board/TiogaPass_Baseboard" for
"/xyz/openbmc_project/sensors/current/CPU0_VDDQ_A_IOUT").

Then we again walk the list returned from D-Bus, for each entry
checking if it represents a LED of any item stored in the current
inventory list; if found, the first associated endpoint path is
stored for this inventory item. In this example no associated LEDs
exist.

The resulting inventory items set is passed to the callback
getInventoryItemAssociationsCb() defined in getInventoryItems().


4. getInventoryItemsConnections() is called, which makes the following
D-Bus call:

# busctl --json=pretty call xyz.openbmc_project.ObjectMapper /xyz/openbmc_project/object_mapper xyz.openbmc_project.ObjectMapper GetSubTree sias /xyz/openbmc_project/inventory 0 4 xyz.openbmc_project.Inventory.Item xyz.openbmc_project.Inventory.Item.PowerSupply xyz.openbmc_project.Inventory.Decorator.Asset xyz.openbmc_project.State.Decorator.OperationalStatus

For each item returned we see if it's present in our inventory and if
yes, add all connections for it (in this example we only add
"xyz.openbmc_project.EntityManager" for
"/xyz/openbmc_project/inventory/system/board/TiogaPass_Baseboard"). Then
it provides all the gathered inventory connections to the
"getInventoryItemsConnectionsCb" callback provided by
getInventoryItems().


5. getInventoryItemsData() gets called, which for every stored
inventory connection tries to find a corresponding path from the
object manager data returned earlier (uses "/" if not found) and then
performs a D-Bus call (in this example just one):

# busctl --json=pretty call xyz.openbmc_project.EntityManager / org.freedesktop.DBus.ObjectManager GetManagedObjects

For each inventory path it saves the result with
storeInventoryItemData() and then calls getInventoryItemsData()
recursively for all the other connections (in this example there was
just one, "xyz.openbmc_project.EntityManager"). After processing them
all, it calls getInventoryItemsDataCb() defined in getInventoryItems().


6. getInventoryLeds() is called, which does this D-Bus call:

# busctl --json=pretty call xyz.openbmc_project.ObjectMapper /xyz/openbmc_project/object_mapper xyz.openbmc_project.ObjectMapper GetSubTree sias /xyz/openbmc_project 0 1 xyz.openbmc_project.Led.Physical

The result is parsed to see if there's any LED path corresponding to
an inventory item, and everything found is stored. The collected data
is passed to getInventoryLedData(), which performs similarly to
getInventoryItemsData().
In the example no suitable LEDs are found, so it proceeds to calling
getInventoryLedsCb() defined in getInventoryItems().


7. getPowerSupplyAttributes() is called, but in this example (since
it's requesting a sensor, not a power node) it does nothing but
proceed to calling getInventoryItemsCb() defined in
processSensorList().


8. getSensorData() is called, which for every sensor connection finds
objectMgrPath (if any, else "/" is used) and calls D-Bus (in this
example just once since we have a single sensor connection):

# busctl --json=pretty call xyz.openbmc_project.PSUSensor / org.freedesktop.DBus.ObjectManager GetManagedObjects

This provides all the current data (including all the associations,
OperationalStatus and Availability!) for all the sensors handled by
this bus (so much more than we actually need in this example).

It finally proceeds to building the JSON response (based on the data
returned by the call and in some cases the inventory data obtained
earlier) and exits; the response (values for the single sensor
requested) is sent back to the client.


Here follows my naive exploratory patch. It works for the sensors
we're interested in (unless they appear after the first Redfish
request made to bmcweb, as the cache for the expensive D-Bus calls is
never invalidated). It doesn't handle complex inventory associations.

I would be willing to work on doing the right thing, but for that I'm
lacking the understanding of the complete picture involved in handling
all types of sensors and interfaces, so I'm kindly asking for your
help with it.
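Before the patch itself, the effect of the duplicated mapper call (steps 0 and 1 issue byte-for-byte identical GetSubTree requests) can be modelled with a toy sketch. This is illustrative Python, not bmcweb code, and every name in it is made up:

```python
# Toy model: a fake mapper that counts how many GetSubTree round trips
# actually happen, and a per-request memo that collapses identical calls.
class FakeMapper:
    def __init__(self):
        self.calls = 0

    def get_subtree(self, path, depth, interfaces):
        self.calls += 1  # one simulated D-Bus round trip
        return {"/xyz/openbmc_project/sensors/current/CPU0_VDDQ_A_IOUT":
                {"xyz.openbmc_project.PSUSensor": list(interfaces)}}

def handle_request(mapper, memo=None):
    # Both call sites below mirror steps 0 and 1 of the walkthrough:
    # they ask the mapper for exactly the same subtree.
    args = ("/xyz/openbmc_project/sensors", 2,
            ("xyz.openbmc_project.Sensor.Value",))

    def subtree():
        if memo is None:
            return mapper.get_subtree(*args)        # no cache: always a round trip
        if args not in memo:
            memo[args] = mapper.get_subtree(*args)  # first use: one round trip
        return memo[args]                           # later uses: free

    subtree()  # step 0: resolve the sensor name to a D-Bus path
    subtree()  # step 1: getObjectsWithConnection() repeats the call

uncached = FakeMapper()
handle_request(uncached)          # two round trips
cached = FakeMapper()
handle_request(cached, memo={})   # one round trip
print(uncached.calls, cached.calls)
```

Because the memo lives only for the duration of one request, the data it serves can never be staler than the request itself, which sidesteps the invalidation problem a global cache (like the one in the patch below) has.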
diff --git a/redfish-core/lib/sensors.hpp b/redfish-core/lib/sensors.hpp
index ac987359ce5b..266dd25b27e0 100644
--- a/redfish-core/lib/sensors.hpp
+++ b/redfish-core/lib/sensors.hpp
@@ -51,6 +51,9 @@ static constexpr std::string_view sensors = "Sensors";
 static constexpr std::string_view thermal = "Thermal";
 } // namespace node
 
+static boost::container::flat_map<std::string, std::string> objectMgrPathsCache;
+static GetSubTreeType subTreeCache;
+
 namespace dbus
 {
 static const boost::container::flat_map<std::string_view,
@@ -244,6 +247,7 @@ void getObjectsWithConnection(
             return;
         }
 
+        sensors::subTreeCache = subtree;
         BMCWEB_LOG_DEBUG << "Found " << subtree.size() << " subtrees";
 
         // Make unique list of connections only for requested sensor types and
@@ -281,11 +285,19 @@ void getObjectsWithConnection(
         callback(std::move(connections), std::move(objectsWithConnection));
         BMCWEB_LOG_DEBUG << "getObjectsWithConnection resp_handler exit";
     };
-    // Make call to ObjectMapper to find all sensors objects
-    crow::connections::systemBus->async_method_call(
-        std::move(respHandler), "xyz.openbmc_project.ObjectMapper",
-        "/xyz/openbmc_project/object_mapper",
-        "xyz.openbmc_project.ObjectMapper", "GetSubTree", path, 2, interfaces);
+    if (sensors::subTreeCache.empty())
+    {
+        // Make call to ObjectMapper to find all sensors objects
+        crow::connections::systemBus->async_method_call(
+            std::move(respHandler), "xyz.openbmc_project.ObjectMapper",
+            "/xyz/openbmc_project/object_mapper",
+            "xyz.openbmc_project.ObjectMapper", "GetSubTree", path, 2,
+            interfaces);
+    }
+    else
+    {
+        respHandler(boost::system::error_code{}, sensors::subTreeCache);
+    }
     BMCWEB_LOG_DEBUG << "getObjectsWithConnection exit";
 }
 
@@ -606,15 +618,26 @@ void getObjectManagerPaths(std::shared_ptr<SensorsAsyncResp> SensorsAsyncResp,
                     << objectPath;
             }
         }
+        sensors::objectMgrPathsCache = *objectMgrPaths;
         callback(objectMgrPaths);
         BMCWEB_LOG_DEBUG << "getObjectManagerPaths respHandler exit";
     };
-    // Query mapper for all DBus object paths that implement ObjectManager
-    crow::connections::systemBus->async_method_call(
-        std::move(respHandler), "xyz.openbmc_project.ObjectMapper",
-        "/xyz/openbmc_project/object_mapper",
-        "xyz.openbmc_project.ObjectMapper", "GetSubTree", "/", 0, interfaces);
+    if (sensors::objectMgrPathsCache.empty())
+    {
+        // Query mapper for all DBus object paths that implement ObjectManager
+        crow::connections::systemBus->async_method_call(
+            std::move(respHandler), "xyz.openbmc_project.ObjectMapper",
+            "/xyz/openbmc_project/object_mapper",
+            "xyz.openbmc_project.ObjectMapper", "GetSubTree", "/", 0,
+            interfaces);
+    }
+    else
+    {
+        callback(std::make_shared<
+                 boost::container::flat_map<std::string, std::string>>(
+            sensors::objectMgrPathsCache));
+    }
     BMCWEB_LOG_DEBUG << "getObjectManagerPaths exit";
 }
 
@@ -2628,9 +2669,11 @@ inline void processSensorList(
         };
 
         // Get inventory items associated with sensors
-        getInventoryItems(SensorsAsyncResp, sensorNames,
-                          objectMgrPaths,
-                          std::move(getInventoryItemsCb));
+        // getInventoryItems(SensorsAsyncResp, sensorNames,
+        //                   objectMgrPaths,
+        //                   std::move(getInventoryItemsCb));
+        getInventoryItemsCb(
+            std::make_shared<std::vector<InventoryItem>>());
         BMCWEB_LOG_DEBUG << "getObjectManagerPathsCb exit";
     };
 
@@ -3105,66 +3148,75 @@ class Sensor : public Node
         const std::array<const char*, 1> interfaces = {
             "xyz.openbmc_project.Sensor.Value"};
 
-        // Get a list of all of the sensors that implement Sensor.Value
-        // and get the path and service name associated with the sensor
-        crow::connections::systemBus->async_method_call(
-            [asyncResp, sensorName](const boost::system::error_code ec,
-                                    const GetSubTreeType& subtree) {
-                BMCWEB_LOG_DEBUG << "respHandler1 enter";
-                if (ec)
-                {
-                    messages::internalError(asyncResp->res);
-                    BMCWEB_LOG_ERROR << "Sensor getSensorPaths resp_handler: "
-                                     << "Dbus error " << ec;
-                    return;
-                }
+        auto respHandler = [asyncResp,
+                            sensorName](const boost::system::error_code ec,
+                                        const GetSubTreeType& subtree) {
+            BMCWEB_LOG_DEBUG << "respHandler1 enter";
+            if (ec)
+            {
+                messages::internalError(asyncResp->res);
+                BMCWEB_LOG_ERROR << "Sensor getSensorPaths resp_handler: "
+                                 << "Dbus error " << ec;
+                return;
+            }
 
-                GetSubTreeType::const_iterator it = std::find_if(
-                    subtree.begin(), subtree.end(),
-                    [sensorName](
-                        const std::pair<
-                            std::string,
-                            std::vector<std::pair<std::string,
-                                                  std::vector<std::string>>>>&
-                            object) {
-                        std::string_view sensor = object.first;
-                        std::size_t lastPos = sensor.rfind("/");
-                        if (lastPos == std::string::npos ||
-                            lastPos + 1 >= sensor.size())
-                        {
-                            BMCWEB_LOG_ERROR << "Invalid sensor path: "
-                                             << sensor;
-                            return false;
-                        }
-                        std::string_view name = sensor.substr(lastPos + 1);
+            sensors::subTreeCache = subtree;
 
-                        return name == sensorName;
-                    });
+            GetSubTreeType::const_iterator it = std::find_if(
+                subtree.begin(), subtree.end(),
+                [sensorName](
+                    const std::pair<
+                        std::string,
+                        std::vector<std::pair<
+                            std::string, std::vector<std::string>>>>& object) {
+                    std::string_view sensor = object.first;
+                    std::size_t lastPos = sensor.rfind("/");
+                    if (lastPos == std::string::npos ||
+                        lastPos + 1 >= sensor.size())
+                    {
+                        BMCWEB_LOG_ERROR << "Invalid sensor path: " << sensor;
+                        return false;
+                    }
+                    std::string_view name = sensor.substr(lastPos + 1);
 
-                if (it == subtree.end())
-                {
-                    BMCWEB_LOG_ERROR << "Could not find path for sensor: "
-                                     << sensorName;
-                    messages::resourceNotFound(asyncResp->res, "Sensor",
-                                               sensorName);
-                    return;
-                }
-                std::string_view sensorPath = (*it).first;
-                BMCWEB_LOG_DEBUG << "Found sensor path for sensor '"
-                                 << sensorName << "': " << sensorPath;
+                    return name == sensorName;
+                });
 
-                const std::shared_ptr<boost::container::flat_set<std::string>>
-                    sensorList = std::make_shared<
-                        boost::container::flat_set<std::string>>();
+            if (it == subtree.end())
+            {
+                BMCWEB_LOG_ERROR << "Could not find path for sensor: "
+                                 << sensorName;
+                messages::resourceNotFound(asyncResp->res, "Sensor",
+                                           sensorName);
+                return;
+            }
+            std::string_view sensorPath = (*it).first;
+            BMCWEB_LOG_DEBUG << "Found sensor path for sensor '" << sensorName
+                             << "': " << sensorPath;
 
-                sensorList->emplace(sensorPath);
-                processSensorList(asyncResp, sensorList);
-                BMCWEB_LOG_DEBUG << "respHandler1 exit";
-            },
-            "xyz.openbmc_project.ObjectMapper",
-            "/xyz/openbmc_project/object_mapper",
-            "xyz.openbmc_project.ObjectMapper", "GetSubTree",
-            "/xyz/openbmc_project/sensors", 2, interfaces);
+            const std::shared_ptr<boost::container::flat_set<std::string>>
+                sensorList =
+                    std::make_shared<boost::container::flat_set<std::string>>();
+
+            sensorList->emplace(sensorPath);
+            processSensorList(asyncResp, sensorList);
+            BMCWEB_LOG_DEBUG << "respHandler1 exit";
+        };
+
+        if (sensors::subTreeCache.empty())
+        {
+            // Get a list of all of the sensors that implement Sensor.Value
+            // and get the path and service name associated with the sensor
+            crow::connections::systemBus->async_method_call(
+                std::move(respHandler), "xyz.openbmc_project.ObjectMapper",
+                "/xyz/openbmc_project/object_mapper",
+                "xyz.openbmc_project.ObjectMapper", "GetSubTree",
+                "/xyz/openbmc_project/sensors", 2, interfaces);
+        }
+        else
+        {
+            respHandler(boost::system::error_code{}, sensors::subTreeCache);
+        }
     }
 };

Thank you for your attention.

-- 
Be free, use free (http://www.gnu.org/philosophy/free-sw.html) software!
mailto:fercerpav@gmail.com
* Re: Redfish requests for dbus-sensors data are needlessly slow - analysis
  2021-07-29  9:46 Redfish requests for dbus-sensors data are needlessly slow - analysis Paul Fertser
@ 2021-07-29 14:53 ` Ed Tanous
  2021-07-29 15:29   ` Paul Fertser
  2021-08-03  3:58   ` Andrew Jeffery
  0 siblings, 2 replies; 9+ messages in thread
From: Ed Tanous @ 2021-07-29 14:53 UTC (permalink / raw)
To: Paul Fertser
Cc: Zhikui Ren, Jae Hyun Yoo, Andrei Tutolmin, Vernon Mauery,
	OpenBMC Maillist, Timofei Nikitin, Jason Bills, Gunnar Mills

On Thu, Jul 29, 2021 at 2:46 AM Paul Fertser <fercerpav@gmail.com> wrote:
>
> Hello,
>
> Short version:
>
> Our operations team complains about very high (seconds) latency of
> requesting sensor data used to monitor health of the servers via
> Redfish. Boards are OCP Tioga Pass with AST2500 BMC. Software
> involved: OpenBMC bmcweb, dbus-sensors, ObjectMapper, EntityManager.
>
> In this mail I provide details about the testing methodology, the
> complete call chain (including RPC) involved in handling Redfish
> requests, and a quick-and-dirty patch that caches some D-Bus replies
> in bmcweb and cuts response times by more than a factor of four. It
> also significantly reduces CPU overhead.
>
> My impression so far is that the current way is rather wasteful and
> can be considerably improved.
>
> Our question is basically whether what we see is normal, whether we
> should change the way servers are monitored, and whether bmcweb
> should be better optimised; if so, how exactly?
>
>
> Longer version:
>
> Our operations team is normally using Zabbix for monitoring, and they
> developed a set of scripts and templates to handle OpenBMC machines:
> https://github.com/Gagar-in/redfish-bmc-lld .
>
> For easier reproduction and analysis I instead use a plain curl
> command to request data for a single specific sensor.
> The following text assumes we're interested in getting the output
> current of the first DRAM VR (handled by the pxe1610 kernel driver
> and the PSUSensor OpenBMC daemon, which caches the values
> periodically read from hwmon nodes).
>
> To make sure I'm measuring just the time it takes to answer the GET
> request itself (and not counting any TLS session establishment
> overhead, as curl is reusing an open connection) I'm running this
> command:
>
> $ curl -w '\n\nTime spent: %{time_total}\n\n\n' -k $(for i in $(seq 5); do echo https://root:0penBmc@$BMCIP/redfish/v1/Chassis/TiogaPass_Baseboard/Sensors/CPU0_VDDQ_A_IOUT; done)
>
> I typically get over 2 seconds for each request made. With the patch
> at the end of the mail the same test consistently shows times below
> half a second.
>
> The major part of the delay comes from numerous (and probably
> suboptimal) D-Bus calls, plus the overhead to process the responses.
>
> Here follows D-Bus profiling data typical for each Redfish sensors
> request made. "mc" is the method call performed by "bmcweb", "mr" is
> the response. I've added the second column (by processing dbus-monitor
> --profile output through awk -vOFS='\t' '{ d=$2-prev; prev=$2;
> $2=d"\t"$2; print $0 }' ) to show the difference between the current
> timestamp and the previous one so it's visible how much time it takes
> for the daemons to answer each request and how much time bmcweb needs
> to process the result.
>
> mc ---------  1625473150.291539 1665 :1.79 xyz.openbmc_project.User.Manager /xyz/openbmc_project/user xyz.openbmc_project.User.Manager GetUserInfo
> mr 0.0463991  1625473150.337938 469 :1.36 :1.79 1665
> mc 0.00556493 1625473150.343503 1666 :1.79 xyz.openbmc_project.ObjectMapper /xyz/openbmc_project/object_mapper xyz.openbmc_project.ObjectMapper GetSubTree
> err 0.0143089 1625473150.357812 470 :1.36 :1.79 1665
> mr 0.0478861  1625473150.405698 5891 :1.49 :1.79 1666
> mc 0.0611379  1625473150.466836 1667 :1.79 xyz.openbmc_project.ObjectMapper /xyz/openbmc_project/object_mapper xyz.openbmc_project.ObjectMapper GetSubTree
> mr 0.0716419  1625473150.538478 5893 :1.49 :1.79 1667
> mc 0.0977321  1625473150.636210 1668 :1.79 xyz.openbmc_project.ObjectMapper /xyz/openbmc_project/object_mapper xyz.openbmc_project.ObjectMapper GetSubTree
> mr 0.0268672  1625473150.663077 5895 :1.49 :1.79 1668
> mc 0.0232749  1625473150.686352 1669 :1.79 xyz.openbmc_project.ObjectMapper / org.freedesktop.DBus.ObjectManager GetManagedObjects
> mr 0.156963   1625473150.843315 5896 :1.49 :1.79 1669
> mc 0.428404   1625473151.271719 1670 :1.79 xyz.openbmc_project.ObjectMapper /xyz/openbmc_project/object_mapper xyz.openbmc_project.ObjectMapper GetSubTree
> mr 0.030153   1625473151.301872 5898 :1.49 :1.79 1670
> mc 0.123439   1625473151.425311 1671 :1.79 xyz.openbmc_project.EntityManager / org.freedesktop.DBus.ObjectManager GetManagedObjects
> mr 0.284928   1625473151.710239 1408 :1.63 :1.79 1671
> mc 0.108877   1625473151.819116 1672 :1.79 xyz.openbmc_project.ObjectMapper /xyz/openbmc_project/object_mapper xyz.openbmc_project.ObjectMapper GetSubTree
> mr 0.0525708  1625473151.871687 5900 :1.49 :1.79 1672
> mc 0.0175102  1625473151.889197 1673 :1.79 xyz.openbmc_project.PSUSensor / org.freedesktop.DBus.ObjectManager GetManagedObjects
> mr 0.119283   1625473152.008480 124202 :1.104 :1.79 1673
>
>
> To provide more insight into what bmcweb is doing and why, I made the
> following call chain description by carefully reading
> redfish-core/lib/sensors.hpp source code:
>
> 0. requestRoutesSensor() is called to process the GET request.
>
> For further processing a D-Bus call is made. For experimenting on a
> running system I'm providing copy-pasteable "busctl" invocations for
> all the calls involved.
>
> # busctl --json=pretty call xyz.openbmc_project.ObjectMapper /xyz/openbmc_project/object_mapper xyz.openbmc_project.ObjectMapper GetSubTree sias /xyz/openbmc_project/sensors 2 1 xyz.openbmc_project.Sensor.Value
>
> The returned array is searched for the requested sensor name, and a
> flat_set is created with a single string containing the D-Bus sensor
> path (in this example
> "/xyz/openbmc_project/sensors/current/CPU0_VDDQ_A_IOUT"). The flat_set
> is passed to processSensorList().
>
>
> 1. processSensorList() calls getConnections() which calls
> getObjectsWithConnection() (the latter can also be called from
> setSensorsOverride()).
>
> It does this call (exactly the same as the previous one!):
>
> # busctl --json=pretty call xyz.openbmc_project.ObjectMapper /xyz/openbmc_project/object_mapper xyz.openbmc_project.ObjectMapper GetSubTree sias /xyz/openbmc_project/sensors 2 1 xyz.openbmc_project.Sensor.Value
>
> For each returned subtree it determines if it's one of the sensors we
> requested. In this case the bus names are added to a "connections"
> flat_set (in this example it's just "xyz.openbmc_project.PSUSensor")
> and a set of tuples "objectsWithConnection" is created (in this
> example it's
> ("/xyz/openbmc_project/sensors/current/CPU0_VDDQ_A_IOUT", "xyz.openbmc_project.PSUSensor")).
> "connections" is passed to the callback provided by
> processSensorList(), "objectsWithConnection" is ignored.
>
>
> 2. getConnectionCb() defined in processSensorList() calls getObjectManagerPaths().
>
> The following call is made:
>
> # busctl --json=pretty call xyz.openbmc_project.ObjectMapper /xyz/openbmc_project/object_mapper xyz.openbmc_project.ObjectMapper GetSubTree sias / 0 1 org.freedesktop.DBus.ObjectManager
>
> The result is converted to an "objectMgrPaths" associative array, with
> connection used as a key and path as a value (in this example the
> essential elements would be
> objectMgrPaths["xyz.openbmc_project.PSUSensor"]="/" and
> objectMgrPaths["xyz.openbmc_project.ObjectMapper"]="/"). This map is
> passed to getObjectManagerPathsCb() defined inside processSensorList().
>
>
> 3. getInventoryItems() calls getInventoryItemAssociations() which uses
> the objectMgrPaths map to look for the path for
> "xyz.openbmc_project.ObjectMapper". If it's not found, "/" is assumed.
>
> The D-Bus call is made:
>
> # busctl --json=pretty call xyz.openbmc_project.ObjectMapper / org.freedesktop.DBus.ObjectManager GetManagedObjects
>
> For each sensor path we're interested in we try to find a
> corresponding inventory object (in this case
> "/xyz/openbmc_project/sensors/current/CPU0_VDDQ_A_IOUT/inventory") to
> get the list of endpoints for its associations. The first endpoint
> path is used in a call to addInventoryItem() (in this example it's
> "/xyz/openbmc_project/inventory/system/board/TiogaPass_Baseboard" for
> "/xyz/openbmc_project/sensors/current/CPU0_VDDQ_A_IOUT").
>
> Then we again walk the list returned from D-Bus, for each entry
> checking if it represents a LED of any item stored in the current
> inventory list; if found, the first associated endpoint path is
> stored for this inventory item. In this example no associated LEDs
> exist.
>
> The resulting inventory items set is passed to the callback
> getInventoryItemAssociationsCb() defined in getInventoryItems().
>
>
> 4. getInventoryItemsConnections() is called, which makes the following
> D-Bus call:
>
> # busctl --json=pretty call xyz.openbmc_project.ObjectMapper /xyz/openbmc_project/object_mapper xyz.openbmc_project.ObjectMapper GetSubTree sias /xyz/openbmc_project/inventory 0 4 xyz.openbmc_project.Inventory.Item xyz.openbmc_project.Inventory.Item.PowerSupply xyz.openbmc_project.Inventory.Decorator.Asset xyz.openbmc_project.State.Decorator.OperationalStatus
>
> For each item returned we see if it's present in our inventory and if
> yes, add all connections for it (in this example we only add
> "xyz.openbmc_project.EntityManager" for
> "/xyz/openbmc_project/inventory/system/board/TiogaPass_Baseboard"). Then
> it provides all the gathered inventory connections to the
> "getInventoryItemsConnectionsCb" callback provided by
> getInventoryItems().
>
>
> 5. getInventoryItemsData() gets called, which for every stored
> inventory connection tries to find a corresponding path from the
> object manager data returned earlier (uses "/" if not found) and then
> performs a D-Bus call (in this example just one):
>
> # busctl --json=pretty call xyz.openbmc_project.EntityManager / org.freedesktop.DBus.ObjectManager GetManagedObjects
>
> For each inventory path it saves the result with
> storeInventoryItemData() and then calls getInventoryItemsData()
> recursively for all the other connections (in this example there was
> just one, "xyz.openbmc_project.EntityManager"). After processing them
> all, it calls getInventoryItemsDataCb() defined in getInventoryItems().
>
>
> 6. getInventoryLeds() is called, which does this D-Bus call:
>
> # busctl --json=pretty call xyz.openbmc_project.ObjectMapper /xyz/openbmc_project/object_mapper xyz.openbmc_project.ObjectMapper GetSubTree sias /xyz/openbmc_project 0 1 xyz.openbmc_project.Led.Physical
>
> The result is parsed to see if there's any LED path corresponding to
> an inventory item, and everything found is stored.
> The collected data is passed to getInventoryLedData(), which performs
> similarly to getInventoryItemsData(). In the example no suitable LEDs
> are found, so it proceeds to calling getInventoryLedsCb() defined in
> getInventoryItems().
>
>
> 7. getPowerSupplyAttributes() is called, but in this example (since
> it's requesting a sensor, not a power node) it does nothing but
> proceed to calling getInventoryItemsCb() defined in
> processSensorList().
>
>
> 8. getSensorData() is called, which for every sensor connection finds
> objectMgrPath (if any, else "/" is used) and calls D-Bus (in this
> example just once since we have a single sensor connection):
>
> # busctl --json=pretty call xyz.openbmc_project.PSUSensor / org.freedesktop.DBus.ObjectManager GetManagedObjects
>
> This provides all the current data (including all the associations,
> OperationalStatus and Availability!) for all the sensors handled by
> this bus (so much more than we actually need in this example).
>
> It finally proceeds to building the JSON response (based on the data
> returned by the call and in some cases the inventory data obtained
> earlier) and exits; the response (values for the single sensor
> requested) is sent back to the client.
>
>
> Here follows my naive exploratory patch. It works for the sensors
> we're interested in (unless they appear after the first Redfish
> request made to bmcweb, as the cache for the expensive D-Bus calls is
> never invalidated). It doesn't handle complex inventory associations.

The "unless" is kind of important here. Sensors can show up or be
removed at any time, especially on an entity-manager enabled system.
As a general rule, bmcweb has avoided having a cache at all. There
have been several attempts at adding one, but most failed because of
the same thing you found: cache invalidation is hard.
If we're going to implement a cache, I'd really like to see it done at
a higher level (somewhere in the connection class) and done more
generically than it is here, which would ensure that all endpoints are
sped up, not just sensors. With that said, if this really solves the
problem and solves it well, I can't very well ignore it.

> I would be willing to work on doing the right thing, but for that I'm
> lacking the understanding of the complete picture involved in handling
> all types of sensors and interfaces, so I'm kindly asking for your
> help with it.
>
> diff --git a/redfish-core/lib/sensors.hpp b/redfish-core/lib/sensors.hpp

Can you please submit the below to gerrit as a WIP instead? There's
much better tooling there for reviewing and testing patches. I can
review it more there.

FWIW, it's on my list to look into the mapper shared-memory caching
layer that @arj wrote a while back. It might also be able to solve
this problem; I don't have the link offhand, but I think it was trying
to solve a similar problem while also handling the cache invalidation
issue.
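To make the invalidation idea concrete, here is a toy sketch (illustrative Python, not the mapper cache and not bmcweb code; all names are made up): a cached GetSubTree reply is dropped whenever an ObjectManager InterfacesAdded or InterfacesRemoved signal reports a topology change, so the next request refetches instead of serving stale data.

```python
# Toy model of signal-driven cache invalidation: hold the expensive
# reply until a (simulated) InterfacesAdded/InterfacesRemoved signal
# arrives, then drop it so the next lookup goes back to the bus.
class SubtreeCache:
    def __init__(self, fetch):
        self._fetch = fetch   # callable standing in for the real GetSubTree
        self._value = None

    def get(self):
        if self._value is None:
            self._value = self._fetch()  # cache miss: one round trip
        return self._value

    def on_interfaces_changed(self, *signal_args):
        self._value = None    # topology changed: invalidate

fetches = []
def fake_fetch():
    fetches.append(1)  # count simulated D-Bus round trips
    return {"/xyz/openbmc_project/sensors/current/CPU0_VDDQ_A_IOUT": {}}

cache = SubtreeCache(fake_fetch)
cache.get()
cache.get()                    # served from cache, no extra round trip
cache.on_interfaces_changed()  # e.g. a sensor was hot-added
cache.get()                    # refetched
print(len(fetches))
```

The trade-off is that every consumer of the cache must be subscribed to the right signals; miss one (associations, LEDs, inventory) and the stale-data problem comes back.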
> index ac987359ce5b..266dd25b27e0 100644 > --- a/redfish-core/lib/sensors.hpp > +++ b/redfish-core/lib/sensors.hpp > @@ -51,6 +51,9 @@ static constexpr std::string_view sensors = "Sensors"; > static constexpr std::string_view thermal = "Thermal"; > } // namespace node > > +static boost::container::flat_map<std::string, std::string> objectMgrPathsCache; > +static GetSubTreeType subTreeCache; > + > namespace dbus > { > static const boost::container::flat_map<std::string_view, > @@ -244,6 +247,7 @@ void getObjectsWithConnection( > return; > } > > + sensors::subTreeCache = subtree; > BMCWEB_LOG_DEBUG << "Found " << subtree.size() << " subtrees"; > > // Make unique list of connections only for requested sensor types and > @@ -281,11 +285,19 @@ void getObjectsWithConnection( > callback(std::move(connections), std::move(objectsWithConnection)); > BMCWEB_LOG_DEBUG << "getObjectsWithConnection resp_handler exit"; > }; > - // Make call to ObjectMapper to find all sensors objects > - crow::connections::systemBus->async_method_call( > - std::move(respHandler), "xyz.openbmc_project.ObjectMapper", > - "/xyz/openbmc_project/object_mapper", > - "xyz.openbmc_project.ObjectMapper", "GetSubTree", path, 2, interfaces); > + if (sensors::subTreeCache.empty()) > + { > + // Make call to ObjectMapper to find all sensors objects > + crow::connections::systemBus->async_method_call( > + std::move(respHandler), "xyz.openbmc_project.ObjectMapper", > + "/xyz/openbmc_project/object_mapper", > + "xyz.openbmc_project.ObjectMapper", "GetSubTree", path, 2, > + interfaces); > + } > + else > + { > + respHandler(boost::system::error_code{}, sensors::subTreeCache); > + } > BMCWEB_LOG_DEBUG << "getObjectsWithConnection exit"; > } > > @@ -606,15 +618,26 @@ void getObjectManagerPaths(std::shared_ptr<SensorsAsyncResp> SensorsAsyncResp, > << objectPath; > } > } > + sensors::objectMgrPathsCache = *objectMgrPaths; > callback(objectMgrPaths); > BMCWEB_LOG_DEBUG << "getObjectManagerPaths respHandler exit"; > }; 
> > - // Query mapper for all DBus object paths that implement ObjectManager > - crow::connections::systemBus->async_method_call( > - std::move(respHandler), "xyz.openbmc_project.ObjectMapper", > - "/xyz/openbmc_project/object_mapper", > - "xyz.openbmc_project.ObjectMapper", "GetSubTree", "/", 0, interfaces); > + if (sensors::objectMgrPathsCache.empty()) > + { > + // Query mapper for all DBus object paths that implement ObjectManager > + crow::connections::systemBus->async_method_call( > + std::move(respHandler), "xyz.openbmc_project.ObjectMapper", > + "/xyz/openbmc_project/object_mapper", > + "xyz.openbmc_project.ObjectMapper", "GetSubTree", "/", 0, > + interfaces); > + } > + else > + { > + callback(std::make_shared< > + boost::container::flat_map<std::string, std::string>>( > + sensors::objectMgrPathsCache)); > + } > BMCWEB_LOG_DEBUG << "getObjectManagerPaths exit"; > } > > @@ -2628,9 +2669,11 @@ inline void processSensorList( > }; > > // Get inventory items associated with sensors > - getInventoryItems(SensorsAsyncResp, sensorNames, > - objectMgrPaths, > - std::move(getInventoryItemsCb)); > + // getInventoryItems(SensorsAsyncResp, sensorNames, > + // objectMgrPaths, > + // std::move(getInventoryItemsCb)); > + getInventoryItemsCb( > + std::make_shared<std::vector<InventoryItem>>()); > > BMCWEB_LOG_DEBUG << "getObjectManagerPathsCb exit"; > }; > @@ -3105,66 +3148,75 @@ class Sensor : public Node > const std::array<const char*, 1> interfaces = { > "xyz.openbmc_project.Sensor.Value"}; > > - // Get a list of all of the sensors that implement Sensor.Value > - // and get the path and service name associated with the sensor > - crow::connections::systemBus->async_method_call( > - [asyncResp, sensorName](const boost::system::error_code ec, > - const GetSubTreeType& subtree) { > - BMCWEB_LOG_DEBUG << "respHandler1 enter"; > - if (ec) > - { > - messages::internalError(asyncResp->res); > - BMCWEB_LOG_ERROR << "Sensor getSensorPaths resp_handler: " > - << "Dbus error " << 
ec; > - return; > - } > + auto respHandler = [asyncResp, > + sensorName](const boost::system::error_code ec, > + const GetSubTreeType& subtree) { > + BMCWEB_LOG_DEBUG << "respHandler1 enter"; > + if (ec) > + { > + messages::internalError(asyncResp->res); > + BMCWEB_LOG_ERROR << "Sensor getSensorPaths resp_handler: " > + << "Dbus error " << ec; > + return; > + } > > - GetSubTreeType::const_iterator it = std::find_if( > - subtree.begin(), subtree.end(), > - [sensorName]( > - const std::pair< > - std::string, > - std::vector<std::pair<std::string, > - std::vector<std::string>>>>& > - object) { > - std::string_view sensor = object.first; > - std::size_t lastPos = sensor.rfind("/"); > - if (lastPos == std::string::npos || > - lastPos + 1 >= sensor.size()) > - { > - BMCWEB_LOG_ERROR << "Invalid sensor path: " > - << sensor; > - return false; > - } > - std::string_view name = sensor.substr(lastPos + 1); > + sensors::subTreeCache = subtree; > > - return name == sensorName; > - }); > + GetSubTreeType::const_iterator it = std::find_if( > + subtree.begin(), subtree.end(), > + [sensorName]( > + const std::pair< > + std::string, > + std::vector<std::pair< > + std::string, std::vector<std::string>>>>& object) { > + std::string_view sensor = object.first; > + std::size_t lastPos = sensor.rfind("/"); > + if (lastPos == std::string::npos || > + lastPos + 1 >= sensor.size()) > + { > + BMCWEB_LOG_ERROR << "Invalid sensor path: " << sensor; > + return false; > + } > + std::string_view name = sensor.substr(lastPos + 1); > > - if (it == subtree.end()) > - { > - BMCWEB_LOG_ERROR << "Could not find path for sensor: " > - << sensorName; > - messages::resourceNotFound(asyncResp->res, "Sensor", > - sensorName); > - return; > - } > - std::string_view sensorPath = (*it).first; > - BMCWEB_LOG_DEBUG << "Found sensor path for sensor '" > - << sensorName << "': " << sensorPath; > + return name == sensorName; > + }); > > - const std::shared_ptr<boost::container::flat_set<std::string>> > - 
sensorList = std::make_shared< > - boost::container::flat_set<std::string>>(); > + if (it == subtree.end()) > + { > + BMCWEB_LOG_ERROR << "Could not find path for sensor: " > + << sensorName; > + messages::resourceNotFound(asyncResp->res, "Sensor", > + sensorName); > + return; > + } > + std::string_view sensorPath = (*it).first; > + BMCWEB_LOG_DEBUG << "Found sensor path for sensor '" << sensorName > + << "': " << sensorPath; > > - sensorList->emplace(sensorPath); > - processSensorList(asyncResp, sensorList); > - BMCWEB_LOG_DEBUG << "respHandler1 exit"; > - }, > - "xyz.openbmc_project.ObjectMapper", > - "/xyz/openbmc_project/object_mapper", > - "xyz.openbmc_project.ObjectMapper", "GetSubTree", > - "/xyz/openbmc_project/sensors", 2, interfaces); > + const std::shared_ptr<boost::container::flat_set<std::string>> > + sensorList = > + std::make_shared<boost::container::flat_set<std::string>>(); > + > + sensorList->emplace(sensorPath); > + processSensorList(asyncResp, sensorList); > + BMCWEB_LOG_DEBUG << "respHandler1 exit"; > + }; > + > + if (sensors::subTreeCache.empty()) > + { > + // Get a list of all of the sensors that implement Sensor.Value > + // and get the path and service name associated with the sensor > + crow::connections::systemBus->async_method_call( > + std::move(respHandler), "xyz.openbmc_project.ObjectMapper", > + "/xyz/openbmc_project/object_mapper", > + "xyz.openbmc_project.ObjectMapper", "GetSubTree", > + "/xyz/openbmc_project/sensors", 2, interfaces); > + } > + else > + { > + respHandler(boost::system::error_code{}, sensors::subTreeCache); > + } > } > }; > > > Thank you for your attention. > > -- > Be free, use free (http://www.gnu.org/philosophy/free-sw.html) software! > mailto:fercerpav@gmail.com ^ permalink raw reply [flat|nested] 9+ messages in thread
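[The single-sensor lookup in the handler above hinges on extracting the trailing component of each D-Bus object path (the `rfind("/")` plus `substr` logic) and matching it against the requested sensor name. That piece can be modeled as a small standalone helper; this is a sketch with simplified types, not actual bmcweb code:]

```cpp
#include <cstddef>
#include <optional>
#include <string_view>

// Extracts the trailing component of a sensor object path, mirroring the
// rfind("/")/substr logic in the handler above, e.g.
// "/xyz/openbmc_project/sensors/current/CPU0_VDDQ_A_IOUT" -> "CPU0_VDDQ_A_IOUT".
// Returns std::nullopt when there is no usable leaf (the "Invalid sensor
// path" case in the log message).
std::optional<std::string_view> sensorLeafName(std::string_view path)
{
    std::size_t lastPos = path.rfind('/');
    if (lastPos == std::string_view::npos || lastPos + 1 >= path.size())
    {
        return std::nullopt;
    }
    return path.substr(lastPos + 1);
}
```

[Finding the requested sensor in a GetSubTree reply is then a linear scan comparing `sensorLeafName(object.first)` against the requested name, as the patch does with `std::find_if`.]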
* Re: Redfish requests for dbus-sensors data are needlessly slow - analysis 2021-07-29 14:53 ` Ed Tanous @ 2021-07-29 15:29 ` Paul Fertser 2021-07-29 16:08 ` Ed Tanous 2021-08-03 3:58 ` Andrew Jeffery 1 sibling, 1 reply; 9+ messages in thread From: Paul Fertser @ 2021-07-29 15:29 UTC (permalink / raw) To: Ed Tanous Cc: Zhikui Ren, Jae Hyun Yoo, Andrei Tutolmin, Vernon Mauery, OpenBMC Maillist, Timofei Nikitin, Jason Bills, Gunnar Mills Hello Ed, Thank you for the reply. On Thu, Jul 29, 2021 at 07:53:38AM -0700, Ed Tanous wrote: > On Thu, Jul 29, 2021 at 2:46 AM Paul Fertser <fercerpav@gmail.com> wrote: > > Here follows my naive exploratory patch. It works for the sensors > > we're interested in (unless they appear after the first Redfish > > request made to bmcweb as the cache for the expensive D-Bus calls is > > never invalidated). It doesn't handle complex inventory associations. > > The "unless" is kind of important here. Sensors can show up or be > removed at any time, especially on an entity-manager enabled system. So bmcweb can subscribe to the signals from entity-manager and invalidate the caches just in case if any EM event registered. > As a general rule, bmcweb has avoided having a cache at all. There > have been several attempts at adding one, but most failed because of > the same thing you found: cache invalidation is hard. Even if it's just dropped on any ObjectMapper or EntityManager signal? > If we're going to implement a cache, I'd really like to see it done > at a higher level (somewhere in the connection class) and done more > generically than this is here, which would ensure that all endpoints > are sped up, not just sensors. With that said, if this really > solves the problem and solves it well, I can't very well ignore it. But is it really a problem? I'm still not sure; probably it's just that we invented a weird usecase not worth optimising for? Have the others faced it and do they consider it to be a problem? 
> > I would be willing to work on doing the right thing but for that I'm > > lacking the understanding of the complete picture involved in handling > > all types of sensors and interfaces, so I'm kindly asking for your > > help with it. > > > > diff --git a/redfish-core/lib/sensors.hpp b/redfish-core/lib/sensors.hpp > > Can you please submit the below to gerrit as a WIP instead. There's > much better tooling there for reviewing and testing patches. I can > review it more there. Sure, will do. I didn't mean to offer it as a solution, it's just to confirm that my understanding of where the bottlenecks are was about right. My real point is that apparently bmcweb is (in some cases) requesting way too much data in way too many calls, using only a tiny fraction of the obtained values in the end. That's why I made that lengthy description of how data for a single sensor is obtained. Caching can hide this to a degree, but shouldn't the root cause be fixed as well? Does bmcweb really _need_ to do all the calls mentioned? I think the fact it performs two identical D-Bus calls in a row gives a clear "no" answer. After reading my outline of the process can you tell that's exactly the way the OpenBMC D-Bus API is designed to be used? -- Be free, use free (http://www.gnu.org/philosophy/free-sw.html) software! mailto:fercerpav@gmail.com ^ permalink raw reply [flat|nested] 9+ messages in thread
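[The invalidate-on-any-signal idea discussed in the message above can be implemented cheaply with a generation counter: signal match callbacks only bump a number, and a cached entry counts as stale whenever its stored generation no longer matches. A hypothetical sketch — the GenerationCache name and shape are invented here, not bmcweb code, and the real thing would call invalidate() from InterfacesAdded/InterfacesRemoved match handlers:]

```cpp
#include <cstdint>
#include <optional>
#include <utility>

// Hypothetical sketch: a single-value cache tagged with a generation
// counter. Any InterfacesAdded/InterfacesRemoved (or EntityManager) signal
// handler would call invalidate(), making the cached subtree stale at once
// without having to work out which entry the signal actually affected.
template <typename T>
class GenerationCache
{
    std::optional<T> value;
    uint64_t storedGen = 0;
    uint64_t currentGen = 1;

  public:
    void store(T v)
    {
        value = std::move(v);
        storedGen = currentGen;
    }
    // Would be called from a D-Bus signal match callback in the real thing.
    void invalidate()
    {
        ++currentGen;
    }
    // Returns the cached value only if no invalidation happened since store().
    std::optional<T> get() const
    {
        if (value && storedGen == currentGen)
        {
            return value;
        }
        return std::nullopt;
    }
};
```

[On a miss the caller would fall back to the GetSubTree call and re-store the reply, so a burst of signals costs at most one extra mapper round trip per endpoint.]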
* Re: Redfish requests for dbus-sensors data are needlessly slow - analysis 2021-07-29 15:29 ` Paul Fertser @ 2021-07-29 16:08 ` Ed Tanous 2021-08-04 10:41 ` Paul Fertser 0 siblings, 1 reply; 9+ messages in thread From: Ed Tanous @ 2021-07-29 16:08 UTC (permalink / raw) To: Paul Fertser Cc: Zhikui Ren, Jae Hyun Yoo, Andrei Tutolmin, Vernon Mauery, OpenBMC Maillist, Timofei Nikitin, Jason Bills, Gunnar Mills On Thu, Jul 29, 2021 at 8:29 AM Paul Fertser <fercerpav@gmail.com> wrote: > > Hello Ed, > > Thank you for the reply. > > On Thu, Jul 29, 2021 at 07:53:38AM -0700, Ed Tanous wrote: > > On Thu, Jul 29, 2021 at 2:46 AM Paul Fertser <fercerpav@gmail.com> wrote: > > > Here follows my naive exploratory patch. It works for the sensors > > > we're interested in (unless they appear after the first Redfish > > > request made to bmcweb as the cache for the expensive D-Bus calls is > > > never invalidated). It doesn't handle complex inventory associations. > > > > The "unless" is kind of important here. Sensors can show up or be > > removed at any time, especially on an entity-manager enabled system. > > So bmcweb can subscribe to the signals from entity-manager and > invalidate the caches just in case if any EM event registered. You have to monitor signals from all connections. bmcweb doesn't only run against entity-manager. FWIW, what you described is essentially entirely what the mapper does. Handling all the race conditions, add events, remove events, with and without object manager support was non-trivial, and this is why Andrew's attempt tries to put it into a library so we don't lose that code. None of this is to say "don't try".... please, give it an attempt and let's see if we can tease something good out here, but just understand that there are race conditions there. > > > As a general rule, bmcweb has avoided having a cache at all. 
There > > have been several attempts at adding one, but most failed because of > > the same thing you found: cache invalidation is hard. > > Even if it's just dropped on any ObjectMapper or EntityManager signal? Those aren't the only signals you need to monitor. If you're interested in going down this route, I'd start by looking at the mapper implementation. You'd also have to handle connections, disconnections, and other stuff. At some point, what you've done starts to look a lot like the mapper. > > > If we're going to implement a cache, I'd really like to see it done > > at a higher level (somewhere in the connection class) and done more > > generically than this is here, which would ensure that all endpoints > > are sped up, not just sensors. With that said, if this really > > solves the problem and solves it well, I can't very well ignore it. > > But is it really a problem? I'm still not sure; probably it's just > that we invented a weird usecase not worth optimising for? Have the > others faced it and do they consider it to be a problem? It's for sure a problem, although your 4 second response times sound worse than most. On an ast2500 I've normally seen a 700ms (ish) response time, but depending on how many sensors you have, and how fast the backends are for each one, I could easily see it getting up there pretty easily. Just one thing to double check; You don't have any custom sensor daemons running on your system do you? Ideally everything should be implementing object manager. If it doesn't, the code paths to build the sensor schema has to take the very slow path (ie calling Get on each property individually). > > > > I would be willing to work on doing the right thing but for that I'm > > > lacking the understanding of the complete picture involved in handling > > > all types of sensors and interfaces, so I'm kindly asking for your > > > help with it. 
> > > > > > diff --git a/redfish-core/lib/sensors.hpp b/redfish-core/lib/sensors.hpp > > > > Can you please submit the below to gerrit as a WIP instead. There's > > much better tooling there for reviewing and testing patches. I can > > review it more there. > > Sure, will do. I didn't mean to offer it as a solution, it's just to > prove my understanding of where the bottlenecks are was about correct. > > My real point is that apparently bmcweb is (in some cases) requesting > way too much data in way too many calls, using only a tiny fraction of > the obtained values in the end. That's why I made that lengthy > description of how data for a single sensor is obtained. Caching can > hide this to a degree, but shouldn't the root cause be fixed as well? Yes, we should come at it from both angles, because a cache will still have to invalidate quite a bit, and nobody likes a bimodal distribution of response times. The next thing on my personal list is to get a new version of GetSubtree added to the mapper, that accepts a list of optional interfaces as well as required. I talked about this a while back on discord. This might clean up a couple of the extra calls in the sensor code, but the primary target was inventory initially. In a perfect world where software is easy and architectures are beautiful, ideally I'd like to see most redfish calls go: Call mapper once with a list of interfaces; Find all connections and objects that implement this interface. Loop over connections, calling GetManagedObjects on each to read in the data and write directly to redfish tree. Send response > Does bmcweb really _need_ to do all the calls mentioned? I think the > fact it performs two identical D-Bus calls in a row gives a clear "no" > answer. You are almost certainly right, two identical calls is wasteful, and we should clean that up if we can. I'm certain there's probably several speedups to be gained by someone understanding the sensor code and optimizing it a little. 
> After reading my outline of the process can you tell that's > exactly the way the OpenBMC D-Bus API is designed to be used? I'm not sure I quite follow this question. The sensor code you're looking at evolved significantly over time into the behemoth you see today. It quite likely has several anti-patterns and duplicated calls that can be consolidated, but that didn't really show up in the individual diffs as features were added. I'm really happy someone has gone through the full flow. Ideally we can start cleaning up the obvious stuff (duplicated calls, etc.) in some patchsets. > > -- > Be free, use free (http://www.gnu.org/philosophy/free-sw.html) software! > mailto:fercerpav@gmail.com ^ permalink raw reply [flat|nested] 9+ messages in thread
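[The "perfect world" flow Ed sketches above (one mapper call to find the connections, then one GetManagedObjects per connection, merged straight into the response) can be modeled without any D-Bus at all. In this sketch stub maps stand in for the mapper reply and for each connection's GetManagedObjects result; all names are hypothetical and the real reply types are far richer:]

```cpp
#include <map>
#include <string>

// path -> sensor reading; a stand-in for the dict-of-dicts that
// org.freedesktop.DBus.ObjectManager.GetManagedObjects actually returns.
using ManagedObjects = std::map<std::string, double>;

// Toy model of the ideal flow: the caller has already asked the mapper
// which connections own sensor objects; we then "call" GetManagedObjects
// once per connection and merge everything into one flat result.
ManagedObjects collectAllSensors(
    const std::map<std::string, ManagedObjects>& byConnection)
{
    ManagedObjects merged;
    for (const auto& [connection, objects] : byConnection)
    {
        // Real code would issue one async GetManagedObjects call to
        // `connection` here instead of reading stub data, and write the
        // properties directly into the Redfish JSON tree.
        merged.insert(objects.begin(), objects.end());
    }
    return merged;
}
```

[The point of the model is the call count: one mapper round trip plus one GetManagedObjects per connection, instead of the chain of per-step lookups the current sensor code performs.]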
* Re: Redfish requests for dbus-sensors data are needlessly slow - analysis 2021-07-29 16:08 ` Ed Tanous @ 2021-08-04 10:41 ` Paul Fertser 2021-08-04 18:51 ` Ed Tanous 0 siblings, 1 reply; 9+ messages in thread From: Paul Fertser @ 2021-08-04 10:41 UTC (permalink / raw) To: Ed Tanous Cc: Zhikui Ren, Jae Hyun Yoo, Andrei Tutolmin, Vernon Mauery, OpenBMC Maillist, Timofei Nikitin, Jason Bills, Gunnar Mills Hello Ed, So as promised I sent the patch to Gerrit on Friday, https://gerrit.openbmc-project.xyz/c/openbmc/bmcweb/+/45404 and you were automatically added as a Reviewer but I haven't worked with WiP before so I have no idea if you got a notification or not (I didn't press the START REVIEW button). I'm ready to provide a documentation patch to the "OpenBMC beginners" instructions once I learn how exactly WiP patches are usually employed by the community. The build is not happy with it as I consciously didn't use clang-format; the diff without it is much clearer (more understandable) in this particular case. On Thu, Jul 29, 2021 at 09:08:30AM -0700, Ed Tanous wrote: > On Thu, Jul 29, 2021 at 8:29 AM Paul Fertser <fercerpav@gmail.com> wrote: > > So bmcweb can subscribe to the signals from entity-manager and > > invalidate the caches just in case if any EM event registered. > > You have to monitor signals from all connections. bmcweb doesn't only > run against entity-manager. 
> > > > Even if it's just dropped on any ObjectMapper or EntityManager signal? > > Those aren't the only signals you need to monitor. If you're > interested in going down this route, I'd start by looking at the > mapper implementation. You'd also have to handle connections, > disconnections, and other stuff. At some point, what you've done > starts to look a lot like the mapper. Understood. So if we have the mapper already why isn't it enough to issue just one call to it, process the reply, and then request directly the data bmcweb needs? > > > If we're going to implement a cache, I'd really like to see it done > > > at a higher level (somewhere in the connection class) and done more > > > generically than this is here, which would ensure that all endpoints > > > are sped up, not just sensors. With that said, if this really > > > solves the problem and solves it well, I can't very well ignore it. > > > > But is it really a problem? I'm still not sure; probably it's just > > that we invented a weird usecase not worth optimising for? Have the > > others faced it and do they consider it to be a problem? > > It's for sure a problem, although your 4 second response times sound > worse than most. It's usually about 2 seconds, and with my silly caching I cut it down to half a second (four times). > On an ast2500 I've normally seen a 700ms (ish) response time, but > depending on how many sensors you have, and how fast the backends > are for each one, I could easily see it getting up there pretty > easily. PSUSensor backend is caching hwmon values so it's reasonably fast. And in my first mail I provided detailed timings of each step bmcweb performs, so it's clear it doesn't add unreasonably much. The system in question (Tioga Pass) shows 153 sensors (with dual 26-cores Xeons running). > Just one thing to double check; You don't have any custom sensor > daemons running on your system do you? Ideally everything should be > implementing object manager. 
If it doesn't, the code paths to build > the sensor schema has to take the very slow path (ie calling Get on > each property individually). In my first mail on the topic I detailed all the calls, and AFAICT there's nothing custom and no Get requests at all. Can I be missing anything? > > My real point is that apparently bmcweb is (in some cases) requesting > way too much data in way too many calls, using only a tiny fraction of > the obtained values in the end. That's why I made that lengthy > description of how data for a single sensor is obtained. Caching can > hide this to a degree, but shouldn't the root cause be fixed as well? > > Yes, we should come at it from both angles, because a cache will > still have to invalidate quite a bit, and nobody likes a bimodal > distribution of response times. This sounds cool but I'm not sure I understand the point. We face caching everywhere, and it's usually multi-layered, so when appropriately benchmarked we often see tri- and more modal distributions. And when not building a hard real-time system the median is much more important than the modality I guess? > The next thing on my personal list is to get a new version of > GetSubtree added to the mapper, that accepts a list of optional > interfaces as well as required. I talked about this a while back on > discord. The d-word offends me. Their ToS (and specifically the explicit prohibition of "third-party" clients) make it absolutely unacceptable to use the (mis)service. > This might clean up a couple of the extra calls in the > sensor code, but the primary target was inventory initially. I'm still rather puzzled by the relationship between different sensors/entities/inventory/leds/etc. Is there a document describing everything that's currently essential? 
> In a perfect world where software is easy and architectures are > beautiful, ideally I'd like to see most redfish calls go: > Call mapper once with a list of interfaces; Find all connections and > objects that implement this interface. > Loop over connections, calling GetManagedObjects on each to read in > the data and write directly to redfish tree. So does anyone know what exactly is not perfect about the current world that makes bmcweb so convoluted? > > Does bmcweb really _need_ to do all the calls mentioned? I think the > fact it performs two identical D-Bus calls in a row gives a clear "no" > answer. > > You are almost certainly right, two identical calls is wasteful, and > we should clean that up if we can. I'm certain there's probably > several speedups to be gained by someone understanding the sensor code > and optimizing it a little. I'm happy I managed to get my point through finally. > > After reading my outline of the process can you tell that's > > exactly the way the OpenBMC D-Bus API is designed to be used? > > I'm not sure I quite follow this question. The sensor code you're > looking at evolved significantly over time into the behemoth you see > today. It quite likely has several anti-patterns and duplicated calls > that can be consolidated, but that didn't really show up in the > individual diffs as features were added. That's perfectly understandable, happens to every software project that's not stale. Not only software even, our own bodies have some rather "interesting" solutions too, that's a natural result of evolution. > I'm really happy someone has gone through the full flow. Ideally we > can start cleaning up the obvious stuff (duplicated calls, etc.) in > some patchsets. 
Since I do not understand the role of /inventory/ paths, and I do not understand how LEDs get involved (apparently we do not have any LEDs like that in our Tioga Pass configuration) and I didn't even try to dig into the "power" nodes I'm certainly not in a position to offer a worthy code contribution in this area :( (other than fixing the duplicated call which can trivially be done by passing an extra argument; but that alone isn't very fruitful) I very much hoped for more involvement from the bmcweb maintainers, as solving this issue properly, in a way that doesn't break any usecases obviously requires plenty of domain-specific knowledge (domain here being the extensive OpenBMC D-Bus API, specifically the sensors/inventory part of it). Guess I'm at fault here and doing something wrong, and it's not the first time I fail to understand how to interact with the community. E.g. I'm still waiting for an answer to a straightforward question asked in https://lists.ozlabs.org/pipermail/openbmc/2021-July/027001.html . What am I doing wrong? Am I copying the wrong people? Am I asking at the wrong place? Should I be expecting something else entirely? I have some reasonable experience interacting with other Free Software communities but somehow it doesn't seem to be applicable here? Thank you for your time. -- Be free, use free (http://www.gnu.org/philosophy/free-sw.html) software! mailto:fercerpav@gmail.com ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Redfish requests for dbus-sensors data are needlessly slow - analysis 2021-08-04 10:41 ` Paul Fertser @ 2021-08-04 18:51 ` Ed Tanous 2021-08-05 18:57 ` Paul Fertser 0 siblings, 1 reply; 9+ messages in thread From: Ed Tanous @ 2021-08-04 18:51 UTC (permalink / raw) To: Paul Fertser Cc: Zhikui Ren, Jae Hyun Yoo, Andrei Tutolmin, Vernon Mauery, OpenBMC Maillist, Ed Tanous, Timofei Nikitin, Jason Bills, Gunnar Mills On Wed, Aug 4, 2021 at 3:41 AM Paul Fertser <fercerpav@gmail.com> wrote: > > Hello Ed, > > So as promised I sent the patch to Gerrit on Friday, > https://gerrit.openbmc-project.xyz/c/openbmc/bmcweb/+/45404 and you > were automatically added as a Reviewer but I haven't worked with WiP > before so I have no idea if you got a notification or not (I didn't > press the START REVIEW button). I'm ready to provide a documentation > patch to the "OpenBMC beginners" instructions once I learn how exactly > WiP patches are usually employed by the community. One thing to keep in mind is if you use the work in progress mode within gerrit, non-admins can't see the change. Generally I just either mark them WIP in a topic, or put "work in progress" in the commit message somewhere. > > The build is not happy with it as I didn't use clang-format > conciously, the diff before it is much clearer (understandble) in this > particular case. Sure; obviously the patch will need to pass tests if it's to merge, but I don't think we're there yet. > > On Thu, Jul 29, 2021 at 09:08:30AM -0700, Ed Tanous wrote: > > On Thu, Jul 29, 2021 at 8:29 AM Paul Fertser <fercerpav@gmail.com> wrote: > > > So bmcweb can subscribe to the signals from entity-manager and > > > invalidate the caches just in case if any EM event registered. > > > > You have to monitor signals from all connections. bmcweb doesn't only > > run against entity-manager. 
FWIW, what you described is essentially > > entirely what the mapper does; > > See how little clue I have, that's exactly the reason I'm addressing > you and the other maintainers because I'm still unable to wrap my head > around all the EM and mapper interactions and what it really takes > _and why_ to get essential information about the system health. > > > > > As a general rule, bmcweb has avoided having a cache at all. There > > > > have been several attempts at adding one, but most failed because of > > > > the same thing you found: cache invalidation is hard. > > > > > > Even if it's just dropped on any ObjectMapper or EntityManager signal? > > > > Those aren't the only signals you need to monitor. If you're > > interested in going down this route, I'd start by looking at the > > mapper implementation. You'd also have to handle connections, > > disconnections, and other stuff. At some point, what you've done > > starts to look a lot like the mapper. > > Understood. So if we have the mapper already why isn't it enough to > issue just one call to it, process the reply, and then request > directly the data bmcweb needs? Because the mapper interfaces are non-ideal for building Redfish resources, and we have to work around them a lot. As a first pass, I proposed a change to the mapper to add a new way to call GetSubTree that would at least deduplicate some of the calls: https://discord.com/channels/775381525260664832/855566794994221117/864536874894229505 > > > > > If we're going to implement a cache, I'd really like to see it done > > > > at a higher level (somewhere in the connection class) and done more > > > > generically than this is here, which would ensure that all endpoints > > > > are sped up, not just sensors. With that said, if this really > > > > solves the problem and solves it well, I can't very well ignore it. > > > > > > But is it really a problem? 
I'm still not sure; probably it's just > > > that we invented a weird usecase not worth optimising for? Have the > > > others faced it and do they consider it to be a problem? > > > > It's for sure a problem, although your 4 second response times sound > > worse than most. > > It's usually about 2 seconds, and with my silly caching I cut it down > to half a second (four times). That's really good profiling information still. I haven't really measured the impact, but it makes me think I'm on the right track trying to optimize mapper calls a bit. > > > On an ast2500 I've normally seen a 700ms (ish) response time, but > > depending on how many sensors you have, and how fast the backends > > are for each one, I could easily see it getting up there pretty > > easily. > > PSUSensor backend is caching hwmon values so it's reasonably fast. ..... This is the way it _should_ be. Unfortunately PSUsensor has some really bad behaviors in terms of blocking dbus transactions while pmbus reads are happening. There's been a couple proposed fixes to this, but nothing has landed yet. > And > in my first mail I provided detailed timings of each step bmcweb > performs, so it's clear it doesn't add unreasonably much. > > The system in question (Tioga Pass) shows 153 sensors (with dual > 26-cores Xeons running). > > > Just one thing to double check; You don't have any custom sensor > > daemons running on your system do you? Ideally everything should be > > implementing object manager. If it doesn't, the code paths to build > > the sensor schema has to take the very slow path (ie calling Get on > > each property individually). > > In my first mail on the topic I detailed all the calls, and AFAICT > there's nothing custom and no Get requests at all. Can I be missing > anything? Good; So long as you're running upstream code we fixed this pattern a long time back. 
> > > > My real point is that apparently bmcweb is (in some cases) requesting > > > way too much data in way too many calls, using only a tiny fraction of > > > the obtained values in the end. That's why I made that lengthy > > > description of how data for a single sensor is obtained. Caching can > > > hide this to a degree, but shouldn't the root cause be fixed as well? > > > > Yes, we should come at it from both angles, because a cache will > > still have to invalidate quite a bit, and nobody likes a bimodal > > distribution of response times. > > This sounds cool but I'm not sure I understand the point. We face > caching everywhere, and it's usually multi-layered, so when > appropriately benchmarked we often see tri- and more modal > distributions. And when not building a hard real-time system the > median is much more important than the modality I guess? I'm not sure I'd say we have "caching everywhere". About the only thing we really cache today is the mapper, and that only really invalidates on startup. My point is really that if we implement a cache in bmcweb, we'll see one response that takes 200ms, then some responses will take 4 seconds when the cache isn't valid. > > > The next thing on my personal list is to get a new version of > > GetSubtree added to the mapper, that accepts a list of optional > > interfaces as well as required. I talked about this a while back on > > discord. > > The d-word offends me. Their ToS (and specifically the explicit > prohibition of "third-party" clients) make it absolutely > unacceptable to use the (mis)service. While I value your opinion, this isn't the right place to be bifurcating the discussion. If you have concerns about discord, feel free to bring them up in another thread and I'm sure you'll be heard. Let's keep on topic about making the sensor code faster. > > > This might clean up a couple of the extra calls in the > > sensor code, but the primary target was inventory initially. 
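The GetSubtree-with-optional-interfaces idea mentioned above can be sketched against a toy object tree. The point is that one call can return the required interface plus any optional ones that happen to be present, instead of one GetSubTree per interface followed by client-side merging. The tree layout, interface names, and method shape are illustrative, not the real mapper API.

```python
# Minimal sketch of a GetSubTree variant taking required + optional
# interface lists. TREE maps object path -> connection -> interfaces.
TREE = {
    "/xyz/openbmc_project/sensors/current/CPU0_VDDQ_A_IOUT": {
        "xyz.openbmc_project.PSUSensor": [
            "xyz.openbmc_project.Sensor.Value",
            "xyz.openbmc_project.Sensor.Threshold.Warning",
        ],
    },
    "/xyz/openbmc_project/inventory/system/board/TiogaPass": {
        "xyz.openbmc_project.EntityManager": [
            "xyz.openbmc_project.Inventory.Item.Board",
        ],
    },
}

def get_subtree(required, optional=()):
    wanted = set(required) | set(optional)
    out = {}
    for path, connections in TREE.items():
        for conn, ifaces in connections.items():
            # Objects must implement every required interface; optional
            # interfaces are reported when present but never filter.
            if set(required) <= set(ifaces):
                out.setdefault(path, {})[conn] = sorted(set(ifaces) & wanted)
    return out

result = get_subtree(
    required=["xyz.openbmc_project.Sensor.Value"],
    optional=["xyz.openbmc_project.Sensor.Threshold.Warning"],
)
```

A single call like this covers the value-plus-thresholds case that today takes several mapper round trips.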
> > I'm still rather puzzled by the relationship between different > sensors/entities/inventory/leds/etc. Is there a document describing > everything that's currently essential? One system's "essential" is another system's "optional". This is the short way of saying, no, I don't think we make any distinction between essential and non-essential relationships on dbus. The best way to understand is to go look at the code like you've done. > > > In a perfect world where software is easy and architectures are > > beautiful, ideally I'd like to see most redfish calls go: > > Call mapper once with a list of interfaces; find all connections and > > objects that implement this interface. > > Loop over connections, calling GetManagedObjects on each to read in > > the data and write directly to redfish tree. > > So does anyone know what exactly is not perfect about the current > world that makes bmcweb so convoluted? I'm going to assume you're talking about the Redfish interface specifically being complex in this case (there are a lot of other reasons why virtual media, KVM, and SOL are complicated). At the risk of making a generalization, I would say it's a combination of the Redfish spec being complex, and the D-Bus interfaces (including the mapper) being designed long before Redfish was a thing. This makes the D-Bus-to-Redfish conversion code non-trivial. We also have the constraint that we support a lot more system architectures than most codebases do, which leads to needing a lot of flexibility in what the Redfish tree looks like per-system. With that said, if you have any ideas for how to simplify, they'd be great to see (ideally in a patchset). > > > > Does bmcweb really _need_ to do all the calls mentioned? I think the > > > fact it performs two identical D-Bus calls in a row gives a clear "no" > > > answer. > > > > You are almost certainly right, two identical calls is wasteful, and > > we should clean that up if we can. 
I'm certain there's probably > > several speedups to be gained by someone understanding the sensor code > > and optimizing it a little. > > I'm happy I managed to get my point through finally. I'm looking forward to the patch. > > > > After reading my outline of the process can you tell that's > > > exactly the way the OpenBMC D-Bus API is designed to be used? > > > > I'm not sure I quite follow this question. The sensor code you're > > looking at evolved significantly over time into the behemoth you see > > today. It quite likely has several anti-patterns and duplicated calls > > that can be consolidated, but that didn't really show up in the > > individual diffs as features were added. > > That's perfectly understandable, happens to every software project > that's not stale. Not only software even, our own bodies have some > rather "interesting" solutions too, that's a natural result of > evolution. > > > I'm really happy someone has gone through the full flow. Ideally we > > can start cleaning up the obvious stuff (duplicated calls, etc.) in > > some patchsets. > > Since I do not understand the role of /inventory/ paths, and I do not > understand how LEDs get involved (apparently we do not have any LEDs > like that in our Tioga Pass configuration) and I didn't even try to > dig into the "power" nodes, I'm certainly not in a position to offer a > worthy code contribution in this area :( (other than fixing the > duplicated call, which can trivially be done by passing an extra > argument; but that alone isn't very fruitful) Reading and understanding the code is probably the best way to become "worthy". It would be great to have even a simple change land on master as the result of this discussion that can start to work on your concerns. 
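The "two identical D-Bus calls in a row" problem discussed above does not need a long-lived cache to fix: identical outbound calls can be coalesced within the scope of a single Redfish request, so nothing needs invalidating afterwards. A minimal sketch, with a made-up transport and call names:

```python
# Deduplicate identical outbound calls for the lifetime of one HTTP request.
class RequestScope:
    def __init__(self, transport):
        self._transport = transport
        self._seen = {}

    def call(self, service, method, *args):
        key = (service, method, args)
        if key not in self._seen:  # only the first identical call goes out
            self._seen[key] = self._transport(service, method, *args)
        return self._seen[key]

outbound = []
def fake_dbus(service, method, *args):
    # Stand-in for a real D-Bus transport; records every call that
    # would hit the bus.
    outbound.append((service, method))
    return {"Value": 1.25}

scope = RequestScope(fake_dbus)
a = scope.call("xyz.openbmc_project.PSUSensor", "GetAll",
               "xyz.openbmc_project.Sensor.Value")
b = scope.call("xyz.openbmc_project.PSUSensor", "GetAll",
               "xyz.openbmc_project.Sensor.Value")
assert a == b  # same reply, but the duplicate never hit the bus
```

Because the memo dies with the request, there is no cache-invalidation problem and no stale-data window, which sidesteps the bimodal-latency concern raised earlier in the thread.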
> > I very much hoped for more involvement from the bmcweb maintainers, as > solving this issue properly, in a way that doesn't break any usecases > obviously requires plenty of domain-specific knowledge (domain here > being the extensive OpenBMC D-Bus API, specifically the > sensors/inventory part of it). I'm not following here.... I (a bmcweb maintainer) have given you several pretty lengthy replies about things that could be worked on. Is there some other involvement you were hoping for that doesn't involve the maintainers just writing up and testing the patches? > > Guess I'm at fault here and doing something wrong, and it's not the > first time I fail to understand how to interact with the > community. E.g. I'm still waiting for an answer to a straightforward > question asked in > https://lists.ozlabs.org/pipermail/openbmc/2021-July/027001.html > . What am I doing wrong? Am I copying the wrong people? Am I asking at > the wrong place? Should I be expecting something else entirely? I have > some reasonable experience interacting with other Free Software > communities but somehow it doesn't seem to be applicable here? I'm guessing nobody knows the answer to that question, and very few people are interested in beep codes. If I had to guess, the BeepCode interface is only implemented in Intel's fork, and probably could be removed from upstream until the backend is pushed; this is likely leftover cruft from pulling dbus-sensors into mainline, but it doesn't hurt anything, so nobody really noticed. > > Thank you for your time. > > -- > Be free, use free (http://www.gnu.org/philosophy/free-sw.html) software! > mailto:fercerpav@gmail.com ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Redfish requests for dbus-sensors data are needlessly slow - analysis 2021-08-04 18:51 ` Ed Tanous @ 2021-08-05 18:57 ` Paul Fertser 2021-08-05 19:25 ` Ed Tanous 0 siblings, 1 reply; 9+ messages in thread From: Paul Fertser @ 2021-08-05 18:57 UTC (permalink / raw) To: Ed Tanous Cc: Zhikui Ren, Jae Hyun Yoo, Andrei Tutolmin, Vernon Mauery, OpenBMC Maillist, Ed Tanous, Timofei Nikitin, Jason Bills, Gunnar Mills Hello, On Wed, Aug 04, 2021 at 11:51:01AM -0700, Ed Tanous wrote: > One thing to keep in mind is if you use the work in progress mode > within gerrit, non-admins can't see the change. Generally I just > either mark them WIP in a topic, or put "work in progress" in the > commit message somewhere. Noted, thank you. > Because the mapper interfaces are non-ideal for building Redfish > resources, and we have to work around them a lot. Still mapper is aware of all the devices appearing and disappearing from the bus, and should be emitting signals accordingly. In what case invalidating all of the (hypothetical) bmcweb connections/mapping/associations cache on receiving such a signal would not be enough? > > PSUSensor backend is caching hwmon values so it's reasonably fast. > > ..... This is the way it _should_ be. Unfortunately PSUsensor has > some really bad behaviors in terms of blocking dbus transactions while > pmbus reads are happening. There's been a couple proposed fixes to > this, but nothing has landed yet. Haven't noticed that in my tests. I had an impression that all reading from hwmon files is performed in fully async mode, Boost::ASIO not blocking on anything and letting the other handlers run. > > > Yes, we should come at it from both angles, because a cache will > > > still have to invalidate quite a bit, and nobody likes a bimodal > > > distribution of response times. > > > > This sounds cool but I'm not sure I understand the point. 
We face > > caching everywhere, and it's usually multi-layered, so when > > appropriately benchmarked we often see tri- and more modal > > distributions. And when not building a hard real-time system the > > median is much more important than the modality I guess? > > I'm not sure I'd say we have "caching everywhere". About the only > thing we really cache today is the mapper, and that only really > invalidates on startup. My point is really that if we implement a > cache in bmcweb, we'll see one response that takes 200ms, then some > responses will take 4 seconds when the cache isn't valid. I thought "nobody likes a bimodal distribution of response times" referred to a general notion, so I thought about all the caching in computer systems on all the levels (CPU, RAM, disks, Ethernet, etc.). One particular example of caching we really dislike would be the one resulting in network bufferbloat, but that's not because of the modality. And regarding this specific question, I do not see anything particularly bad about having to wait for 4 seconds a few times a day if thousands of requests are processed in 200 ms over the course of the same day. Better than having to wait 2 seconds for each of them every time. > > > This might clean up a couple of the extra calls in the > > > sensor code, but the primary target was inventory initially. > > > > I'm still rather puzzled by the relationship between different > > sensors/entities/inventory/leds/etc. Is there a document describing > > everything that's currently essential? > > One system's "essential" is another system's "optional". This is the > short way of saying, no, I don't think we make any distinction between > essential and non-essential relationships on dbus. The best way to > understand is to go look at the code like you've done. TBH, looking through the Boost::ASIO async code was painful and, most importantly, it often doesn't provide full rationale. 
I can eventually understand what is being done, but my ideas about why that is needed are speculations prone to error; that's why, in the absence of a design rationale document, I was hoping to get answers from the authors of the code in question. > > > > Does bmcweb really _need_ to do all the calls mentioned? I think the > > > > fact it performs two identical D-Bus calls in a row gives a clear "no" > > > > answer. > > > > > > You are almost certainly right, two identical calls is wasteful, and > > > we should clean that up if we can. I'm certain there's probably > > > several speedups to be gained by someone understanding the sensor code > > > and optimizing it a little. > > > > I'm happy I managed to get my point through finally. > > I'm looking forward to the patch. I wonder what kind of testing would be needed to make sure it doesn't break all the systems I have no idea about. Those that have sub-sensors, complex associations and related LEDs, and probably those that are using phosphor-hwmon and not dbus-sensors... > > Since I do not understand the role of /inventory/ paths, and I do not > > understand how LEDs get involved (apparently we do not have any LEDs > > like that in our Tioga Pass configuration) and I didn't even try to > > dig into the "power" nodes, I'm certainly not in a position to offer a > > worthy code contribution in this area :( (other than fixing the > > duplicated call, which can trivially be done by passing an extra > > argument; but that alone isn't very fruitful) > > Reading and understanding the code is probably the best way to become > "worthy". It would be great to have even a simple change land on > master as the result of this discussion that can start to work on your > concerns. Thank you for the prompt review of my simple change; I'll be working on fixing the issues you raised. 
> > I very much hoped for more involvement from the bmcweb maintainers, as > > solving this issue properly, in a way that doesn't break any usecases > > obviously requires plenty of domain-specific knowledge (domain here > > being the extensive OpenBMC D-Bus API, specifically the > > sensors/inventory part of it). > > I'm not following here.... I (a bmcweb maintainer) have given you > several pretty lengthy replies about things that could be worked on. > Is there some other involvement you were hoping for that doesn't > involve the maintainers just writing up and testing the patches? Yes, and I appreciate it a lot. What I hoped for is for somebody who has all the knowledge about the bmcweb Redfish implementation and OpenBMC architecture at her or his fingertips to chime in and say something like: "we need steps M and N for X and Y reasons, and step K was needed for Z but is redundant by now, and L should stay for a while until <some> cleanup is merged." Or "you're barking up the wrong tree, do /this/ and /that/ instead to monitor the health of your servers." I can be reading https://github.com/openbmc/docs/blob/master/architecture/object-mapper.md and https://github.com/openbmc/docs/blob/master/architecture/sensor-architecture.md and https://github.com/openbmc/docs/blob/master/architecture/LED-architecture.md for days, and the relevant source code, and experimenting on Tioga Pass, and digging through all your hints, and reading all the conflicting changes that Gerrit nicely shows for my caching patch, but that won't ever give me the breadth of insight into all possible cases (especially the edge cases) that those who developed and evolved this architecture have. Thank you for listening. -- Be free, use free (http://www.gnu.org/philosophy/free-sw.html) software! mailto:fercerpav@gmail.com
* Re: Redfish requests for dbus-sensors data are needlessly slow - analysis 2021-08-05 18:57 ` Paul Fertser @ 2021-08-05 19:25 ` Ed Tanous 0 siblings, 0 replies; 9+ messages in thread From: Ed Tanous @ 2021-08-05 19:25 UTC (permalink / raw) To: Paul Fertser Cc: Zhikui Ren, Jae Hyun Yoo, Andrei Tutolmin, Vernon Mauery, OpenBMC Maillist, Ed Tanous, Timofei Nikitin, Jason Bills, Gunnar Mills On Thu, Aug 5, 2021 at 11:57 AM Paul Fertser <fercerpav@gmail.com> wrote: > > Hello, > > On Wed, Aug 04, 2021 at 11:51:01AM -0700, Ed Tanous wrote: > > One thing to keep in mind is if you use the work in progress mode > > within gerrit, non-admins can't see the change. Generally I just > > either mark them WIP in a topic, or put "work in progress" in the > > commit message somewhere. > > Noted, thank you. > > > Because the mapper interfaces are non-ideal for building Redfish > > resources, and we have to work around them a lot. > > Still mapper is aware of all the devices appearing and disappearing > from the bus, and should be emitting signals accordingly. In what case > invalidating all of the (hypothetical) bmcweb > connections/mapping/associations cache on receiving such a signal > would not be enough? "should be" does not always make it so :) The mapper has its own signal that it uses that I suspect isn't fine-grained enough to invalidate a cache. Would it be enough to maintain a stable cache? It would probably be a good start, and we could certainly discuss the pros and cons on your patchset, but to reiterate, I don't want to maintain a mapper cache just for sensors, and I want to have a good understanding of how much memory said cache would use for a fully loaded system to make sure we don't overrun our targets. > > > > PSUSensor backend is caching hwmon values so it's reasonably fast. > > > > ..... This is the way it _should_ be. Unfortunately PSUsensor has > > some really bad behaviors in terms of blocking dbus transactions while > > pmbus reads are happening. 
There's been a couple proposed fixes to > > this, but nothing has landed yet. > > Haven't noticed that in my tests. I had an impression that all reading > from hwmon files is performed in fully async mode, Boost::ASIO not > blocking on anything and letting the other handlers run. In practice, async_read is not async under the hood for the file descriptor types we're using, and ends up blocking on a read call. io_uring or pushing it all to its own thread has been proposed as solutions to this. There's some old threads on this in lore if you're interested in trying to work on this. > > > > > Yes, we should come at it from both angles, because a cache will > > > > still have to invalidate quite a bit, and nobody likes a bimodal > > > > distribution of response times. > > > > > > This sounds cool but I'm not sure I understand the point. We face > > > caching everywhere, and it's usually multi-layered, so when > > > appropriately benchmarked we often see tri- and more modal > > > distributions. And when not building a hard real-time system the > > > median is much more important than the modality I guess? > > > > I'm not sure I'd say we have "caching everywhere". About the only > > thing we really cache today is the mapper, and that only really > > invalidates on startup. My point is really that if we implement a > > cache in bmcweb, we'll see one response that takes 200ms, then some > > responses will take 4 seconds when the cache isn't valid. > > I thought "nobody likes a bimodal distribution of response times" > referred to a general notion, so I thought about all the caching in > computer systems on all the levels (CPU, RAM, disks, Ethernet etc > etc). One particular example of caching we really dislike would be the > one resulting in network bufferbloat, but that's not because of the > modality. Yep, in this case I'm specifically talking about your proposed change. 
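The blocking-read problem Ed describes (async_read ending up as a blocking read on hwmon-style file descriptors) is commonly worked around by moving the read onto a worker thread so the event loop stays free; this is a minimal sketch of that pattern, with the hwmon path and value purely illustrative.

```python
# Keep the event loop responsive by doing the blocking file read on a
# pool thread, instead of blocking inside the reactor.
import asyncio
import os
import tempfile

async def read_hwmon(path):
    loop = asyncio.get_running_loop()
    def blocking_read():
        # open()/read() on sysfs-style nodes can block for milliseconds;
        # on a pool thread, other handlers keep running meanwhile.
        with open(path) as f:
            return f.read().strip()
    return await loop.run_in_executor(None, blocking_read)

async def main():
    # Stand-in for an in1_input hwmon node reporting milli-units.
    fd, path = tempfile.mkstemp()
    os.write(fd, b"1250\n")
    os.close(fd)
    raw = await read_hwmon(path)
    os.unlink(path)
    return int(raw) / 1000

value = asyncio.run(main())
print(value)  # 1.25
```

io_uring, as mentioned in the thread, achieves the same goal (not stalling the reactor) without the thread hop, at the cost of a newer-kernel dependency.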
> > And regarding this specific question, I do not see anything > particularly bad about having to wait for 4 seconds few times a day if > thousands of requests are processed in 200 ms over the course of the > same day. Better than having to wait 2 seconds for each of them every > time. if these were real world numbers, I'd probably agree with you. My point is that the mapper is a trivial piece of software that largely does nothing but return responses from an in-memory buffer. It should not be the bottleneck, and if it is, we should understand why before we implement caching. > > > > > This might clean up a couple of the extra calls in the > > > > sensor code, but the primary target was inventory initially. > > > > > > I'm still rather puzzled by the relationship between different > > > sensors/entities/inventory/leds/etc. Is there a document describing > > > everything that's currently essential? > > > > One systems "essential" is another systems "optional". This is the > > short way of saying, no, I don't think we make any distinction between > > essential and non-essential relationships on dbus. The best way to > > understand is to go look at the code like you've done. > > TBH, looking through the Boost::ASIO async code was painful and most > importantly it often doesn't provide full rationale. I can eventually > understand what is being done, but my ideas about why that is needed > are speculations prone to errors, that's why in the absence of a > design rationale document I was hoping to get answers from the authors > of the code in question. > > > > > > Does bmcweb really _need_ to do all the calls mentioned? I think the > > > > > fact it performs two identical D-Bus calls in a row gives a clear "no" > > > > > answer. > > > > > > > > You are almost certainly right, two identical calls is wasteful, and > > > > we should clean that up if we can. 
I'm certain there's probably > > > > several speedups to be gained by someone understanding the sensor code > > > > and optimizing it a little. > > > > > > I'm happy I managed to get my point through finally. > > > > I'm looking forward to the patch. > > I wonder what kind of testing would be needed to make sure it doesn't > break all the systems I have no idea about. Those that have > sub-sensors, complex associations and related LEDs, and probably those > that are using phosphor-hwmon and not dbus-sensors... > > > > Since I do not understand the role of /inventory/ pathes, and I do not > > > understand how LEDs get involved (apparently we do not have any LEDs > > > like that in our Tioga Pass configuration) and I didn't even try to > > > dig the "power" nodes I'm certainly not in a position to offer a > > > worthy code contribution in this area :( (other than fixing the > > > duplicated call which can trivially be done by passing an extra > > > argument; but that alone isn't fruitful much) > > > > Reading and understanding the code is probably the best way to become > > "worthy". It would be great to have even a simple change land on > > master as the result of this discussion that can start to work on your > > concerns. > > Thank you for the prompt review for my simple change, I'll be working > on fixing the issues you raised. > > > > I very much hoped for more involvement from the bmcweb maintainers, as > > > solving this issue properly, in a way that doesn't break any usecases > > > obviously requires plenty of domain-specific knowledge (domain here > > > being the extensive OpenBMC D-Bus API, specifically the > > > sensors/inventory part of it). > > > > I'm not following here.... I (a bmcweb maintainer) have given you > > several pretty lengthy replies about things that could be worked on. > > Is there some other involvement you were hoping for that doesn't > > involve the maintainers just writing up and testing the patches? 
> > Yes, and I appreciate it a lot. > > What I hoped for is for somebody who has all the knowledge about > bmcweb Redfish implementation and OpenBMC architecture at her or his > fingertips to chime in and say something like: "we need steps M and N > for X and Y reasons, and step K was needed for Z but is redundant by > now, and L should stay for a while until <some> cleanup is merged." Or > "you're barking up the wrong tree, do /this/ and /that/ instead to > monitor the health of your servers." Considering I was commit 1 on bmcweb, I like to think I have some knowledge, but if I had that level of detail on what needed to be done, I would've written the patch a long time ago to fix this. The next steps are largely what I proposed originally; someone needs to profile the code to understand which parts are slow, find where we're being inefficient, or where we can do fewer calls, or where the easy things are. If the conclusion is that we need a cache, that's a big piece of effort that someone would need to champion. > > I can be reading > https://github.com/openbmc/docs/blob/master/architecture/object-mapper.md > and > https://github.com/openbmc/docs/blob/master/architecture/sensor-architecture.md > and > https://github.com/openbmc/docs/blob/master/architecture/LED-architecture.md > > for days, and the relevant source code, and experimenting on Tioga > Pass, and digging all your hints, and reading all the conflicting > changes that Gerrit nicely shows for my caching patch, but that won't > ever give me the breadth of the insight into all possible cases > (especially the edge cases) that those who developed and evolved this > architecture have. > > Thank you for listening. > > -- > Be free, use free (http://www.gnu.org/philosophy/free-sw.html) software! > mailto:fercerpav@gmail.com
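The profiling step Ed proposes above can start very simply: wrap each outbound call in a timer and accumulate per-call latencies, much like the dbus-monitor/awk pipeline from the first mail but inside the client. A minimal sketch, with the transport and call names purely illustrative:

```python
# Record per-call latency for each named operation, then look at the
# accumulated totals to find which calls are worth eliminating or batching.
import time
from collections import defaultdict

timings = defaultdict(list)

def timed(name, fn, *args):
    start = time.monotonic()
    result = fn(*args)
    timings[name].append(time.monotonic() - start)
    return result

def fake_get_subtree(path):
    time.sleep(0.01)  # stand-in for a mapper round trip
    return {path: {}}

for _ in range(3):
    timed("GetSubTree", fake_get_subtree, "/xyz/openbmc_project/sensors")

worst = max(timings["GetSubTree"])
print(f"GetSubTree worst case: {worst * 1000:.1f} ms")
```

Sorting operations by total accumulated time (count times mean latency) points directly at the duplicated or unnecessary calls discussed earlier in the thread.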
* Re: Redfish requests for dbus-sensors data are needlessly slow - analysis 2021-07-29 14:53 ` Ed Tanous 2021-07-29 15:29 ` Paul Fertser @ 2021-08-03 3:58 ` Andrew Jeffery 1 sibling, 0 replies; 9+ messages in thread From: Andrew Jeffery @ 2021-08-03 3:58 UTC (permalink / raw) To: Ed Tanous, Paul Fertser Cc: Zhikui Ren, Jae Hyun Yoo, Andrei Tutolmin, Vernon Mauery, OpenBMC Maillist, Timofei Nikitin, Bills, Jason M, Gunnar Mills On Fri, 30 Jul 2021, at 00:23, Ed Tanous wrote: > On Thu, Jul 29, 2021 at 2:46 AM Paul Fertser <fercerpav@gmail.com> wrote: > > I would be willing to work on doing the right thing but for that I'm > > lacking the understanding of the complete picture involved in handling > > all types of sensors and interfaces, so I'm kindly asking for your > > help with it. > > > > diff --git a/redfish-core/lib/sensors.hpp b/redfish-core/lib/sensors.hpp > > Can you please submit the below to gerrit as a WIP instead. There's > much better tooling there for reviewing and testing patches. I can > review it more there. > > FWIW, it's on my list to look into the mapper shared-memory caching > layer that @arj wrote a while back. It might also be able to solve > this problem; I don't have the link offhand, but I think it was > trying to solve a similar problem, but handled the cache invalidation > issue as well. Here's the link: https://github.com/amboar/shmapper FWIW it's not a caching layer - mapper clients directly access the mapper data structures through the shared memory mapping. The client APIs are all 'const' so you have to go out of your way to damage the data (or we can provide private-copy based APIs). It would be cool to get to the point where applications can push their DBus schema directly into the shmapper daemon. This would remove the performance horrors associated with the introspection phase of the current implementation. Anyway, unfortunately I haven't had time to hack on it recently. 
The current codebase is mostly the result of a hyperfocus brain-dump of ideas underpinned by some fuzzing. Performance and ergonomics can be improved a bit. The underlying ideas are also quite similar to Boost's Interprocess module[1], though implemented in C leveraging the sparse semantic parser[2] (which I think makes it easier to take advantage of than a Boost implementation, but there are trade-offs). [1] https://www.boost.org/doc/libs/1_76_0/doc/html/interprocess.html [2] https://sparse.docs.kernel.org/en/latest/ Andrew
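The shmapper idea Andrew describes — clients reading the mapper's data through a read-only shared-memory mapping instead of D-Bus round trips — can be sketched in miniature with a file-backed mapping. This is only an analogy: the real project is C and shares live data structures directly, not a serialized snapshot, and the path below is illustrative.

```python
# "Daemon" side publishes a snapshot to a file; the "client" maps it
# read-only, so writes raise instead of corrupting the shared data,
# mirroring the const-only client API described above.
import mmap
import os
import tempfile

fd, path = tempfile.mkstemp()
os.write(fd, b"/xyz/openbmc_project/sensors/current/CPU0_VDDQ_A_IOUT\n")
os.close(fd)

with open(path, "rb") as f:
    view = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    # Reading costs no IPC round trip at all: it's just memory access.
    first_path = bytes(view[:view.find(b"\n")])
    view.close()

os.unlink(path)
print(first_path.decode())
```

The latency win is that a lookup becomes a pointer chase in mapped memory rather than a serialized method call through the bus daemon.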
end of thread, other threads:[~2021-08-05 19:26 UTC | newest] Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2021-07-29 9:46 Redfish requests for dbus-sensors data are needlessly slow - analysis Paul Fertser 2021-07-29 14:53 ` Ed Tanous 2021-07-29 15:29 ` Paul Fertser 2021-07-29 16:08 ` Ed Tanous 2021-08-04 10:41 ` Paul Fertser 2021-08-04 18:51 ` Ed Tanous 2021-08-05 18:57 ` Paul Fertser 2021-08-05 19:25 ` Ed Tanous 2021-08-03 3:58 ` Andrew Jeffery